Antibody libraries

ABSTRACT

The present invention relates to the production of antibody libraries. In particular, the present invention relates to the use of integrating retroviral vectors to generate libraries comprising a plurality of combinations of antibody light chains and heavy chains. The present invention thus provides improved methods of generating and screening antibody libraries comprising large numbers of unique antibodies.

[0001] This application claims priority to provisional patent applications serial Nos. 60/368,808, filed Mar. 28, 2002 and 60/371,299, filed Apr. 10, 2002; each of which is herein incorporated by reference in its entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to the production of antibody libraries. In particular, the present invention relates to the use of integrating retroviral vectors to generate libraries comprising a plurality of combinations of antibody light chains and heavy chains.

BACKGROUND OF THE INVENTION

[0003] The pharmaceutical biotechnology industry is based on the production of recombinant proteins in mammalian cells. These proteins are essential to the therapeutic treatment of many diseases and conditions. In particular, antibodies are of increasing importance in human therapy, assay procedures and diagnostic methods. However, methods of identifying and producing antibodies are often expensive, particularly where monoclonal antibodies are required. Hybridoma technology has traditionally been employed to produce monoclonal antibodies, but these methods are time-consuming and result in isolation and production of limited numbers of specific antibodies. Additionally, relatively small amounts of antibody are produced; consequently, hybridoma methods have not been developed for a large number of antibodies. This is unfortunate as the potential repertoire of immunoglobulins produced in an immunized animal is quite large (on the order of >10¹⁰), yet hybridoma technology is too complicated and time-consuming to adequately screen and develop large numbers of useful antibodies.

[0004] One approach to this problem has been the development of library screening methods for the isolation of antibodies (Huse et al., Science 246:1275 [1989]; McCafferty et al., Nature 348:552 [1990]). Functional antibody fragments have been produced in E. coli cells (Better et al., Science 240:1041 [1988]; Sastry et al., PNAS 86:5728 [1989]) as “libraries” of recombinant immunoglobulins containing both heavy and light variable domains (Huse et al., supra). The expressed proteins have antigen-binding affinity comparable to the corresponding natural antibodies. However, it is difficult to isolate high binding populations of antibodies from such libraries and where bacterial cells are used to express specific antibodies, isolation and purification procedures are usually complex and time-consuming.

[0005] Combinatorial antibody libraries generated from phage lambda (Huse et al., supra) typically include millions of genes of different antibodies but require complex procedures to screen the library for a selected clone. Methods have been reported for the production of human antibodies using the combinatorial library approach in filamentous bacteriophage. A major disadvantage of such methods is the need to rely on initial isolation of the antibody DNA from peripheral human blood to prepare the library. Moreover, the generation of human antibodies to toxic compounds is not feasible owing to risks involved in immunizing a human with these compounds.

[0006] Currently the most widely used approach for screening polypeptide libraries is to display polypeptides on the surface of filamentous bacteriophage. The polypeptides are expressed as fusions to the N-terminus of a coat protein. As the phage assembles, the fusion proteins are incorporated in the viral coat so that the polypeptides become displayed on the bacteriophage surface. Each polypeptide produced is displayed on the surface of one or more of the bacteriophage particles and subsequently tested for specific ligand interactions. While this approach appears attractive, there are numerous problems, including difficulties of enriching positive clones from phage libraries. Enrichment procedures are based on selective binding to, and elution from, a solid surface such as an immobilized receptor. Unfortunately, avidity effects arise due to multivalent binding of the phage and the general tendency of phage to contain two or more copies of the displayed polypeptide. The binding to the receptor surface therefore does not depend solely on the strength of interaction between the receptor and the displayed polypeptide. This causes difficulties in the identification of clones with high affinity for the receptor.

[0007] Thus, the art is in need of efficient methods of generating and screening antibody libraries containing large numbers of antibodies.

SUMMARY OF THE INVENTION

[0008] The present invention relates to the production of antibody libraries. In particular, the present invention relates to the use of integrating retroviral vectors to generate libraries comprising a plurality of combinations of antibody light chains and heavy chains.

[0009] For example, in some embodiments, the present invention provides an antibody library comprising at least 10² cells, wherein each cell comprises at least one integrated retroviral vector expressing an antibody light chain. In some embodiments, the antibody library expresses at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibody light chains. In some preferred embodiments, each of the cells comprises exactly one of the integrated retroviral vectors.

[0010] The present invention also provides an antibody library comprising at least 10² cells, wherein each cell comprises at least one integrated retroviral vector expressing an antibody heavy chain. In some embodiments, the antibody library expresses at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibody heavy chains. In some preferred embodiments, each of the cells comprises exactly one of the integrated retroviral vectors.

[0011] The present invention further provides an antibody library comprising at least 10² cells, wherein each cell comprises at least one of a first integrated retroviral vector and at least one of a second integrated retroviral vector, wherein the first retroviral vector expresses an antibody light chain and the second retroviral vector expresses an antibody heavy chain, and wherein the antibody light chain and the antibody heavy chain associate to form an antibody. In some embodiments, the first and second integrated vectors are separately integrated. In some embodiments, the antibody library expresses at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibodies. In some preferred embodiments, the cell comprises exactly one of the first integrated retroviral vector and exactly one of the second integrated retroviral vector.

[0012] The present invention additionally provides a retroviral particle library comprising at least 10² retroviral particles, wherein each retroviral particle comprises one antibody light chain gene. In some embodiments, the retroviral particle library expresses at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibody light chain genes.

[0013] In other embodiments, the present invention provides a retroviral particle library comprising at least 10² retroviral particles, wherein each retroviral particle comprises one antibody heavy chain gene. In some embodiments, the retroviral particle library expresses at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibody heavy chain genes.

[0014] In still further embodiments, the present invention provides a retroviral particle library comprising at least 10² retroviral particles, wherein each retroviral particle comprises at least one antibody gene selected from the group consisting of antibody heavy chain genes and antibody light chain genes. In some embodiments, the retroviral particle library expresses at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibody genes. In some preferred embodiments, each retroviral particle comprises one antibody heavy chain gene and one antibody light chain gene.

[0015] In yet other embodiments, the present invention provides a plasmid library comprising at least 10² plasmids, wherein each plasmid comprises one antibody heavy chain gene inserted into a retroviral vector backbone. In some embodiments, the plasmid library expresses at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibody heavy chain genes.

[0016] In still additional embodiments, the present invention provides a plasmid library comprising at least 10² plasmids, wherein each plasmid comprises one antibody light chain gene inserted into a retroviral vector backbone. In some embodiments, the plasmid library expresses at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibody light chain genes.

[0017] In certain embodiments, the present invention provides a plasmid library comprising at least 10² plasmids, wherein each plasmid comprises at least one antibody gene selected from the group consisting of antibody heavy chain gene and antibody light chain gene. In some embodiments, the plasmid library expresses at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibody genes. In some preferred embodiments, each plasmid comprises one antibody heavy chain gene and one antibody light chain gene.

[0018] The present invention also provides a method of generating antibody libraries, comprising: providing a plurality of first integratable retroviral particles, wherein each of the plurality of retroviral particles comprises one antibody light chain gene; a plurality of second integratable retroviral particles, wherein each of the plurality of retroviral particles comprises one antibody heavy chain gene; and a host cell comprising a genome; and contacting the host cell with the plurality of first and second integratable retroviral particles under conditions such that at least one of the plurality of first integratable retroviral particles and at least one of the plurality of second integratable retroviral particles integrate into the genome of the host cell to generate an antibody library. In some embodiments, the plurality of first integratable retroviral particles further comprises a first selectable marker, and the plurality of second integratable retroviral particles further comprises a second selectable marker. In some embodiments, the contacting further comprises selecting for the presence of the first and second selectable markers. In some embodiments, the antibody library comprises at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibodies. In some preferred embodiments, exactly one of the plurality of first integratable retroviral particles and exactly one of the plurality of second integratable retroviral particles integrate into the genome of the host cell. In some embodiments, the method further comprises the step of screening the antibody library. In some embodiments, the screening comprises detecting the ability of antibodies in the antibody library to bind to a pre-selected antigen. In some embodiments, the antibodies are bound to the membrane of the host cell and the detecting comprises fluorescence activated cell sorting. In certain embodiments, the antibodies are secreted and the detecting comprises a solution-based detection assay. In some embodiments, the antibodies are diluted into individual containers prior to said detecting. In some embodiments, the solution-based assay is selected from the group consisting of radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, immunoprecipitation reactions, agglutination assays (e.g., hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, and protein A assays.
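The combinatorial character of such a library can be illustrated with a short calculation. The following Python sketch is offered only as an illustration and is not part of the disclosed method: it assumes that every light chain can in principle pair with every heavy chain, so that the product of the two pool sizes is an upper bound on the number of unique antibodies. The function name and the chain labels (LC1, HC1, etc.) are hypothetical.

```python
from itertools import product


def max_unique_antibodies(n_light: int, n_heavy: int) -> int:
    """Upper bound on distinct light/heavy pairings.

    Assumption (not stated in the text): every light chain can pair with
    every heavy chain, so the bound is the product of the two pool sizes.
    """
    if n_light < 0 or n_heavy < 0:
        raise ValueError("pool sizes must be non-negative")
    return n_light * n_heavy


# Hypothetical labels standing in for unique chain genes.
light_chains = [f"LC{i}" for i in range(1, 4)]   # 3 light chains
heavy_chains = [f"HC{i}" for i in range(1, 3)]   # 2 heavy chains

# Enumerate every possible light/heavy combination.
pairings = list(product(light_chains, heavy_chains))

if __name__ == "__main__":
    print(len(pairings))                      # number of enumerated pairings
    print(max_unique_antibodies(100, 1000))   # bound for 10^2 x 10^3 chains
```

Under this assumption, a pool of 10² light chain genes combined with a pool of 10³ heavy chain genes could yield up to 10⁵ unique antibodies; the number actually represented in a given library would depend on how many host cells are transduced and recovered.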

[0019] The present invention further provides a method of screening antibody libraries, comprising: providing an antibody library comprising at least 10² unique antibodies; and a pre-selected antigen; and screening the antibody library, wherein the screening comprises detecting the ability of the at least 10² unique antibodies to bind to the pre-selected antigen. In some embodiments, the antibody library comprises at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibodies. In some embodiments, the antibodies are bound to the membrane of a host cell and the detecting comprises fluorescence activated cell sorting. In certain embodiments, the antibodies are secreted and the detecting comprises a solution-based detection assay. In some embodiments, the antibodies are diluted into individual containers prior to said detecting. In some embodiments, the solution-based assay is selected from the group consisting of radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, immunoprecipitation reactions, agglutination assays (e.g., hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, and protein A assays.

[0020] The present invention additionally provides a method, comprising providing a plurality of first integratable retroviral particles, wherein each of the plurality of retroviral particles comprises one antibody light chain gene; a plurality of second integratable retroviral particles, wherein each of the plurality of retroviral particles comprises one antibody heavy chain gene; and a host cell comprising a genome; and a pre-selected antigen; and contacting the host cell with the plurality of first and second integratable retroviral particles under conditions such that at least one of the plurality of first integratable retroviral particles and at least one of the plurality of second integratable retroviral particles integrate into the genome of the host cell to generate an antibody library comprising a plurality of antibodies; and screening the antibody library, wherein the screening comprises detecting the ability of the antibodies to bind to the pre-selected antigen. In some embodiments, the antibody library comprises at least 10², preferably at least 10³, even more preferably at least 10⁴, and still more preferably at least 10⁵ unique antibodies. In some embodiments, the plurality of first integratable retroviral particles further comprises a first selectable marker, and the plurality of second integratable retroviral particles further comprises a second selectable marker. In some embodiments, the contacting further comprises selecting for the presence of the first and second selectable markers. In some embodiments, the antibodies are bound to the membrane of the host cell and the detecting comprises fluorescence activated cell sorting. In some embodiments, the antibodies are secreted and the detecting comprises a solution-based detection assay. In some embodiments, the antibodies are diluted into individual containers prior to said detecting. In some embodiments, the solution-based assay is selected from the group consisting of radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, immunoprecipitation reactions, agglutination assays (e.g., hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, and protein A assays.

DESCRIPTION OF THE FIGURES

[0021] FIG. 1 shows a plasmid map of pLBC-L2HCF.

[0022] FIG. 2 shows a plasmid map of pLBC-M4HCF.

[0023] FIG. 3 shows a plasmid map of pLNC-L2LC.

[0024] FIG. 4 shows a plasmid map of pLNC-M4LC.

[0025] FIG. 5 shows the nucleic acid sequence of pLBC-L2HCF (SEQ ID NO: 1).

[0026] FIG. 6 shows the nucleic acid sequence of pLBC-M4HCF (SEQ ID NO: 2).

[0027] FIG. 7 shows the nucleic acid sequence of pLNC-L2LC (SEQ ID NO: 3).

[0028] FIG. 8 shows the nucleic acid sequence of pLNC-M4LC (SEQ ID NO: 4).

DEFINITIONS

[0029] To facilitate understanding of the invention, a number of terms are defined below.

[0030] As used herein, the term “host cell” refers to any eukaryotic cell (e.g., mammalian cells, avian cells, amphibian cells, plant cells, fish cells, and insect cells), whether located in vitro or in vivo.

[0031] As used herein, the term “cell culture” refers to any in vitro culture of cells. Included within this term are continuous cell lines (e.g., with an immortal phenotype), primary cell cultures, finite cell lines (e.g., non-transformed cells), and any other cell population maintained in vitro, including oocytes and embryos.

[0032] As used herein, the term “vector” refers to any genetic element, such as a plasmid, phage, transposon, cosmid, chromosome, virus, virion, etc., which is capable of replication when associated with the proper control elements and which can transfer gene sequences between cells. Thus, the term includes cloning and expression vehicles, as well as viral vectors.

[0033] As used herein, the term “integrating vector” refers to a vector whose integration or insertion into a nucleic acid (e.g., a chromosome) is accomplished via an integrase. Examples of “integrating vectors” include, but are not limited to, retroviral vectors, transposons, and adeno-associated virus vectors.

[0034] As used herein, the term “integrated” refers to a vector that is stably inserted into the genome (i.e., into a chromosome) of a host cell.

[0035] As used herein, the term “multiplicity of infection” or “MOI” refers to the ratio of integrating vectors:host cells used during transfection or transduction of host cells. For example, if 1,000,000 vectors are used to transduce 100,000 host cells, the multiplicity of infection is 10. The use of this term is not limited to events involving transduction, but instead encompasses introduction of a vector into a host by methods such as lipofection, microinjection, calcium phosphate precipitation, and electroporation.
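The arithmetic in the example above can be written out as a short Python sketch. The first function simply restates the ratio given in the definition. The second function is an added illustration that rests on an assumption not made in the text: that vector entry events are independent and equally likely for every cell, so that the number of vectors per cell follows a Poisson distribution with mean equal to the MOI. The function names are hypothetical.

```python
import math


def multiplicity_of_infection(n_vectors: float, n_cells: float) -> float:
    """Ratio of integrating vectors to host cells, as defined in the text."""
    if n_cells <= 0:
        raise ValueError("number of host cells must be positive")
    return n_vectors / n_cells


def poisson_fraction(moi: float, k: int) -> float:
    """Expected fraction of cells receiving exactly k vectors.

    Assumption (not from the text): vector uptake is Poisson-distributed
    with mean equal to the MOI. Real integration efficiency may be lower.
    """
    if moi < 0 or k < 0:
        raise ValueError("moi and k must be non-negative")
    return math.exp(-moi) * moi ** k / math.factorial(k)


if __name__ == "__main__":
    # The worked example from the definition: 1,000,000 vectors, 100,000 cells.
    print(multiplicity_of_infection(1_000_000, 100_000))
    # Under the Poisson assumption, fraction of cells with exactly one vector
    # at a low MOI of 0.1.
    print(poisson_fraction(0.1, 1))
```

Under the stated Poisson assumption, lowering the MOI raises the proportion of transduced cells that carry exactly one vector, at the cost of leaving more cells untransduced.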

[0036] As used herein, the term “genome” refers to the genetic material (e.g., chromosomes) of an organism.

[0037] The term “nucleotide sequence of interest” refers to any nucleotide sequence (e.g., RNA or DNA), the manipulation of which may be deemed desirable for any reason (e.g., treat disease, confer improved qualities, expression of a protein of interest in a host cell, expression of a ribozyme, etc.), by one of ordinary skill in the art. Such nucleotide sequences include, but are not limited to, coding sequences of structural genes (e.g., reporter genes, selection marker genes, oncogenes, drug resistance genes, growth factors, etc.), and non-coding regulatory sequences which do not encode an mRNA or protein product (e.g., promoter sequence, polyadenylation sequence, termination sequence, enhancer sequence, etc.).

[0038] As used herein, the term “protein of interest” refers to a protein encoded by a nucleic acid of interest.

[0039] As used herein, the term “signal protein” refers to a protein that is co-expressed with a protein of interest and which, when detected by a suitable assay, provides indirect evidence of expression of the protein of interest. Examples of signal proteins useful in the present invention include, but are not limited to, beta-galactosidase, beta-lactamase, green fluorescent protein, and luciferase.

[0040] As used herein, the term “exogenous gene” refers to a gene that is not naturally present in a host organism or cell, or is artificially introduced into a host organism or cell.

[0041] The term “gene” refers to a nucleic acid (e.g., DNA or RNA) sequence that comprises coding sequences necessary for the production of a polypeptide or precursor (e.g., proinsulin). The polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., enzymatic activity, ligand binding, signal transduction, etc.) of the full-length polypeptide or fragment are retained. The term also encompasses the coding region of a structural gene and includes sequences located adjacent to the coding region on both the 5′ and 3′ ends for a distance of about 1 kb or more on either end such that the gene corresponds to the length of the full-length mRNA. The sequences that are located 5′ of the coding region and which are present on the mRNA are referred to as 5′ untranslated sequences. The sequences that are located 3′ or downstream of the coding region and which are present on the mRNA are referred to as 3′ untranslated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene that are transcribed into heterogeneous nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide.

[0042] As used herein, the term “gene expression” refers to the process of converting genetic information encoded in a gene into RNA (e.g., mRNA, rRNA, tRNA, or snRNA) through “transcription” of the gene (i.e., via the enzymatic action of an RNA polymerase), and for protein encoding genes, into protein through “translation” of mRNA. Gene expression can be regulated at many stages in the process. “Up-regulation” or “activation” refers to regulation that increases the production of gene expression products (i.e., RNA or protein), while “down-regulation” or “repression” refers to regulation that decreases production. Molecules (e.g., transcription factors) that are involved in up-regulation or down-regulation are often called “activators” and “repressors,” respectively.

[0043] Where “amino acid sequence” is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms, such as “polypeptide” or “protein” are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule.

[0044] As used herein, the terms “nucleic acid molecule encoding,” “DNA sequence encoding,” “DNA encoding,” “RNA sequence encoding,” and “RNA encoding” refer to the order or sequence of deoxyribonucleotides or ribonucleotides along a strand of deoxyribonucleic acid or ribonucleic acid. The order of these deoxyribonucleotides or ribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA or RNA sequence thus codes for the amino acid sequence.

[0045] As used herein, the term “variant,” when used in reference to a protein, refers to proteins encoded by partially homologous nucleic acids so that the amino acid sequence of the proteins varies. As used herein, the term “variant” encompasses proteins encoded by homologous genes having both conservative and nonconservative amino acid substitutions that do not result in a change in protein function, as well as proteins encoded by homologous genes having amino acid substitutions that cause decreased (e.g., null mutations) protein function or increased protein function.

[0046] A gene may produce multiple RNA species that are generated by differential splicing of the primary RNA transcript. cDNAs that are splice variants of the same gene will contain regions of sequence identity or complete homology (representing the presence of the same exon or portion of the same exon on both cDNAs) and regions of complete non-identity (for example, representing the presence of exon “A” on cDNA 1 wherein cDNA 2 contains exon “B” instead). Because the two cDNAs contain regions of sequence identity they will both hybridize to a probe derived from the entire gene or portions of the gene containing sequences found on both cDNAs; the two splice variants are therefore substantially homologous to such a probe and to each other.

[0047] The terms “in operable combination,” “in operable order,” and “operably linked” as used herein refer to the linkage of nucleic acid sequences in such a manner that a nucleic acid molecule capable of directing the transcription of a given gene and/or the synthesis of a desired protein molecule is produced. The term also refers to the linkage of amino acid sequences in such a manner so that a functional protein is produced.

[0048] As used herein, the term “selectable marker” refers to a gene that encodes an enzymatic activity that confers the ability to grow in medium lacking what would otherwise be an essential nutrient (e.g., the HIS3 gene in yeast cells); in addition, a selectable marker may confer resistance to an antibiotic or drug upon the cell in which the selectable marker is expressed. Selectable markers may be “dominant”; a dominant selectable marker encodes an enzymatic activity that can be detected in any eukaryotic cell line. Examples of dominant selectable markers include the bacterial aminoglycoside 3′ phosphotransferase gene (also referred to as the neo gene) that confers resistance to the drug G418 in mammalian cells, the bacterial hygromycin B phosphotransferase (hyg) gene that confers resistance to the antibiotic hygromycin and the bacterial xanthine-guanine phosphoribosyl transferase gene (also referred to as the gpt gene) that confers the ability to grow in the presence of mycophenolic acid. Other selectable markers are not dominant in that their use must be in conjunction with a cell line that lacks the relevant enzyme activity. Examples of non-dominant selectable markers include the thymidine kinase (tk) gene that is used in conjunction with tk⁻ cell lines, the CAD gene which is used in conjunction with CAD-deficient cells and the mammalian hypoxanthine-guanine phosphoribosyl transferase (hprt) gene which is used in conjunction with hprt⁻ cell lines. A review of the use of selectable markers in mammalian cell lines is provided in Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York (1989) pp. 16.9-16.15.

[0049] As used herein, the term “selecting for the presence of said first and second selectable markers” refers to culturing cells transduced with a retrovirus comprising a selectable marker under conditions that require the presence of the selectable marker in order for growth (e.g., culturing cells in the presence of a particular nutrient, antibiotic or drug).

[0050] As used herein, the term “regulatory element” refers to a genetic element that controls some aspect of the expression of nucleic acid sequences. For example, a promoter is a regulatory element that facilitates the initiation of transcription of an operably linked coding region. Other regulatory elements are splicing signals, polyadenylation signals, termination signals, RNA export elements, internal ribosome entry sites, etc. (defined infra).

[0051] Transcriptional control signals in eukaryotes comprise “promoter” and “enhancer” elements. Promoters and enhancers consist of short arrays of DNA sequences that interact specifically with cellular proteins involved in transcription (Maniatis et al., Science 236:1237 [1987]). Promoter and enhancer elements have been isolated from a variety of eukaryotic sources including genes in yeast, insect and mammalian cells, and viruses (analogous control elements, i.e., promoters, are also found in prokaryotes). The selection of a particular promoter and enhancer depends on what cell type is to be used to express the protein of interest. Some eukaryotic promoters and enhancers have a broad host range while others are functional in a limited subset of cell types (for review, see Voss et al., Trends Biochem. Sci., 11:287 [1986]; and Maniatis et al., supra). For example, the SV40 early gene enhancer is very active in a wide variety of cell types from many mammalian species and has been widely used for the expression of proteins in mammalian cells (Dijkema et al., EMBO J. 4:761 [1985]). Two other examples of promoter/enhancer elements active in a broad range of mammalian cell types are those from the human elongation factor 1α gene (Uetsuki et al., J. Biol. Chem., 264:5791 [1989]; Kim et al., Gene 91:217 [1990]; and Mizushima and Nagata, Nuc. Acids. Res., 18:5322 [1990]) and the long terminal repeats of the Rous sarcoma virus (Gorman et al., Proc. Natl. Acad. Sci. USA 79:6777 [1982]) and the human cytomegalovirus (Boshart et al., Cell 41:521 [1985]).

[0052] As used herein, the term “promoter/enhancer” denotes a segment of DNA which contains sequences capable of providing both promoter and enhancer functions (i.e., the functions provided by a promoter element and an enhancer element, see above for a discussion of these functions). For example, the long terminal repeats of retroviruses contain both promoter and enhancer functions. The enhancer/promoter may be “endogenous” or “exogenous” or “heterologous.” An “endogenous” enhancer/promoter is one that is naturally linked with a given gene in the genome. An “exogenous” or “heterologous” enhancer/promoter is one that is placed in juxtaposition to a gene by means of genetic manipulation (i.e., molecular biological techniques such as cloning and recombination) such that transcription of that gene is directed by the linked enhancer/promoter.

[0053] Regulatory elements may be tissue specific or cell specific. The term “tissue specific” as it applies to a regulatory element refers to a regulatory element that is capable of directing selective expression of a nucleotide sequence of interest to a specific type of tissue (e.g., liver) in the relative absence of expression of the same nucleotide sequence of interest in a different type of tissue (e.g., lung).

[0054] Tissue specificity of a regulatory element may be evaluated by, for example, operably linking a reporter gene to a promoter sequence (which is not tissue-specific) and to the regulatory element to generate a reporter construct, introducing the reporter construct into the genome of an animal such that the reporter construct is integrated into every tissue of the resulting transgenic animal, and detecting the expression of the reporter gene (e.g., detecting mRNA, protein, or the activity of a protein encoded by the reporter gene) in different tissues of the transgenic animal. The detection of a greater level of expression of the reporter gene in one or more tissues relative to the level of expression of the reporter gene in other tissues shows that the regulatory element is “specific” for the tissues in which greater levels of expression are detected. Thus, the term “tissue-specific” (e.g., liver-specific) as used herein is a relative term that does not require absolute specificity of expression. In other words, the term “tissue-specific” does not require that one tissue have extremely high levels of expression and another tissue have no expression. It is sufficient that expression is greater in one tissue than another. By contrast, “strict” or “absolute” tissue-specific expression is meant to indicate expression in a single tissue type (e.g., liver) with no detectable expression in other tissues.
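The distinction drawn above between relative and “strict” tissue specificity can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: the reporter expression levels are hypothetical numbers in arbitrary units, the detection limit is a hypothetical parameter, and “relative” specificity is read conservatively here as expression in the tissue of interest exceeding that in every other tissue measured. None of these names or thresholds come from the text.

```python
from typing import Mapping


def classify_specificity(
    expression: Mapping[str, float],
    tissue: str,
    detection_limit: float = 0.0,
) -> str:
    """Classify reporter expression in `tissue` against the other tissues.

    Returns "strict" when no other tissue shows detectable expression,
    "relative" when the tissue merely exceeds every other tissue measured,
    and "not specific" otherwise. The detection limit and the
    exceeds-every-other-tissue rule are illustrative assumptions.
    """
    target = expression[tissue]
    others = [level for name, level in expression.items() if name != tissue]
    if not others:
        raise ValueError("at least one comparison tissue is required")
    if target <= detection_limit:
        return "not specific"
    if all(level <= detection_limit for level in others):
        return "strict"
    if all(target > level for level in others):
        return "relative"
    return "not specific"


if __name__ == "__main__":
    # Hypothetical reporter levels (arbitrary units).
    print(classify_specificity({"liver": 10.0, "lung": 2.0}, "liver"))
    print(classify_specificity({"liver": 10.0, "lung": 0.0}, "liver"))
```

In practice the comparison would be made on measured reporter mRNA, protein, or activity levels, and the choice of detection limit would depend on the assay used.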

[0055] The term “cell type specific” as applied to a regulatory element refers to a regulatory element that is capable of directing selective expression of a nucleotide sequence of interest in a specific type of cell in the relative absence of expression of the same nucleotide sequence of interest in a different type of cell within the same tissue. The term “cell type specific” when applied to a regulatory element also means a regulatory element capable of promoting selective expression of a nucleotide sequence of interest in a region within a single tissue.

[0056] Cell type specificity of a regulatory element may be assessed using methods well known in the art (e.g., immunohistochemical staining and/or Northern blot analysis). Briefly, for immunohistochemical staining, tissue sections are embedded in paraffin, and paraffin sections are reacted with a primary antibody specific for the polypeptide product encoded by the nucleotide sequence of interest whose expression is regulated by the regulatory element. A labeled (e.g., peroxidase conjugated) secondary antibody specific for the primary antibody is allowed to bind to the sectioned tissue and specific binding detected (e.g., with avidin/biotin) by microscopy. Briefly, for Northern blot analysis, RNA is isolated from cells and electrophoresed on agarose gels to fractionate the RNA according to size followed by transfer of the RNA from the gel to a solid support (e.g., nitrocellulose or a nylon membrane). The immobilized RNA is then probed with a labeled oligo-deoxyribonucleotide probe or DNA probe to detect RNA species complementary to the probe used. Northern blots are a standard tool of molecular biologists.

[0057] The term “promoter,” “promoter element,” or “promoter sequence” as used herein, refers to a DNA sequence which when ligated to a nucleotide sequence of interest is capable of controlling the transcription of the nucleotide sequence of interest into mRNA. A promoter is typically, though not necessarily, located 5′ (i.e., upstream) of a nucleotide sequence of interest whose transcription into mRNA it controls, and provides a site for specific binding by RNA polymerase and other transcription factors for initiation of transcription.

[0058] Promoters may be constitutive or regulatable. The term “constitutive” when made in reference to a promoter means that the promoter is capable of directing transcription of an operably linked nucleic acid sequence in the absence of a stimulus (e.g., heat shock, chemicals, etc.). In contrast, a “regulatable” promoter is one that is capable of directing a level of transcription of an operably linked nucleic acid sequence in the presence of a stimulus (e.g., heat shock, chemicals, etc.) that is different from the level of transcription of the operably linked nucleic acid sequence in the absence of the stimulus.

[0059] The presence of “splicing signals” on an expression vector often results in higher levels of expression of the recombinant transcript. Splicing signals mediate the removal of introns from the primary RNA transcript and consist of a splice donor and acceptor site (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York [1989], pp. 16.7-16.8). A commonly used splice donor and acceptor site is the splice junction from the 16S RNA of SV40.

[0060] Efficient expression of recombinant DNA sequences in eukaryotic cells requires expression of signals directing the efficient termination and polyadenylation of the resulting transcript. Transcription termination signals are generally found downstream of the polyadenylation signal and are a few hundred nucleotides in length. The term “poly A site” or “poly A sequence” as used herein denotes a DNA sequence that directs both the termination and polyadenylation of the nascent RNA transcript. Efficient polyadenylation of the recombinant transcript is desirable as transcripts lacking a poly A tail are unstable and are rapidly degraded. The poly A signal utilized in an expression vector may be “heterologous” or “endogenous.” An endogenous poly A signal is one that is found naturally at the 3′ end of the coding region of a given gene in the genome. A heterologous poly A signal is one that is isolated from one gene and placed 3′ of another gene. A commonly used heterologous poly A signal is the SV40 poly A signal. The SV40 poly A signal is contained on a 237 bp BamHI/BclI restriction fragment and directs both termination and polyadenylation (Sambrook, supra, at 16.6-16.7).

[0061] Eukaryotic expression vectors may also contain “viral replicons” or “viral origins of replication.” Viral replicons are viral DNA sequences that allow for the extrachromosomal replication of a vector in a host cell expressing the appropriate replication factors. Vectors that contain either the SV40 or polyoma virus origin of replication replicate to high “copy number” (up to 10⁴ copies/cell) in cells that express the appropriate viral T antigen. Vectors that contain the replicons from bovine papillomavirus or Epstein-Barr virus replicate extrachromosomally at “low copy number” (~100 copies/cell). However, it is not intended that expression vectors be limited to any particular viral origin of replication.

[0062] As used herein, the term “long terminal repeat” or “LTR” refers to transcriptional control elements located in or isolated from the U3 region 5′ and 3′ of a retroviral genome. As is known in the art, long terminal repeats may be used as control elements in retroviral vectors, or isolated from the retroviral genome and used to control expression from other types of vectors.

[0063] As used herein, the terms “RNA export element” or “Pre-mRNA Processing Enhancer (PPE)” refer to 3′ and 5′ cis-acting post-transcriptional regulatory elements that enhance export of RNA from the nucleus. “PPE” elements include, but are not limited to, Mertz sequences (described in U.S. Pat. Nos. 5,914,267 and 5,686,120, both of which are incorporated herein by reference) and woodchuck mRNA processing enhancer (WPRE; WO99/14310 and U.S. Pat. No. 6,136,597, each of which is incorporated herein by reference).

[0064] As used herein, the term “polycistronic” refers to an mRNA encoding more than one polypeptide chain (See, e.g., WO 93/03143, WO 88/05486, and European Pat. No. 1,170,58, all of which are incorporated herein by reference). Likewise, the term “arranged in polycistronic sequence” refers to the arrangement of genes encoding two different polypeptide chains in a single mRNA.

[0065] As used herein, the term “internal ribosome entry site” or “IRES” refers to a sequence located between polycistronic genes that permits the production of the expression product originating from the second gene by internal initiation of the translation of the dicistronic mRNA. Examples of internal ribosome entry sites include, but are not limited to, those derived from foot and mouth disease virus (FDV), encephalomyocarditis virus, poliovirus and RDV (Scheper et al., Biochem. 76: 801-809 [1994]; Meyer et al., J. Virol. 69: 2819-2824 [1995]; Jang et al., J. Virol. 62: 2636-2643 [1988]; Haller et al., J. Virol. 66: 5075-5086 [1995]). Vectors incorporating IRES's may be assembled as is known in the art. For example, a retroviral vector containing a polycistronic sequence may contain the following elements in operable association: nucleotide polylinker, gene of interest, an internal ribosome entry site and a mammalian selectable marker or another gene of interest. The polycistronic cassette is situated within the retroviral vector between the 5′ LTR and the 3′ LTR at a position such that transcription from the 5′ LTR promoter transcribes the polycistronic message cassette. The transcription of the polycistronic message cassette may also be driven by an internal promoter (e.g., cytomegalovirus promoter) or an inducible promoter, which may be preferable depending on the use. The polycistronic message cassette can further comprise a cDNA or genomic DNA (gDNA) sequence operatively associated within the polylinker. Any mammalian selectable marker can be utilized as the polycistronic message cassette mammalian selectable marker. Such mammalian selectable markers are well known to those of skill in the art and can include, but are not limited to, kanamycin/G418, hygromycin B or mycophenolic acid resistance markers.

[0066] As used herein, the term “retrovirus” refers to a retroviral particle which is capable of entering a cell (i.e., the particle contains a membrane-associated protein such as an envelope protein or a viral G glycoprotein which can bind to the host cell surface and facilitate entry of the viral particle into the cytoplasm of the host cell) and integrating the retroviral genome (as a double-stranded provirus) into the genome of the host cell. The term “retrovirus” encompasses Oncovirinae (e.g., Moloney murine leukemia virus (MoMLV), Moloney murine sarcoma virus (MoMSV), and Mouse mammary tumor virus (MMTV)), Spumavirinae, and Lentivirinae (e.g., Human immunodeficiency virus, Simian immunodeficiency virus, Equine infectious anemia virus, and Caprine arthritis-encephalitis virus; See, e.g., U.S. Pat. Nos. 5,994,136 and 6,013,516, both of which are incorporated herein by reference).

[0067] As used herein, the term “retroviral vector” refers to a retrovirus that has been modified to express a gene of interest. Retroviral vectors can be used to transfer genes efficiently into host cells by exploiting the viral infectious process. Foreign or heterologous genes cloned (i.e., inserted using molecular biological techniques) into the retroviral genome can be delivered efficiently to host cells that are susceptible to infection by the retrovirus. Through well known genetic manipulations, the replicative capacity of the retroviral genome can be destroyed. The resulting replication-defective vectors can be used to introduce new genetic material to a cell but they are unable to replicate. A helper virus or packaging cell line can be used to permit vector particle assembly and egress from the cell. Such retroviral vectors comprise a replication-deficient retroviral genome containing a nucleic acid sequence encoding at least one gene of interest (i.e., a polycistronic nucleic acid sequence can encode more than one gene of interest), a 5′ retroviral long terminal repeat (5′ LTR); and a 3′ retroviral long terminal repeat (3′ LTR).

[0068] The term “pseudotyped retroviral vector” refers to a retroviral vector containing a heterologous membrane protein. The term “membrane-associated protein” refers to a protein (e.g., a viral envelope glycoprotein or the G proteins of viruses in the Rhabdoviridae family such as VSV, Piry, Chandipura and Mokola), which is associated with the membrane surrounding a viral particle; these membrane-associated proteins mediate the entry of the viral particle into the host cell. The membrane associated protein may bind to specific cell surface protein receptors, as is the case for retroviral envelope proteins or the membrane-associated protein may interact with a phospholipid component of the plasma membrane of the host cell, as is the case for the G proteins derived from members of the Rhabdoviridae family.

[0069] As used herein, the term “retroviral particle” refers to infectious viral particles generated by packaging a retroviral vector in a packaging cell line (See e.g., Example 3).

[0070] As used herein, the term “retroviral particle library” refers to a plurality of retroviral particles comprising a plurality of unique antibody genes (e.g., heavy or light chain genes). In preferred embodiments, retroviral particle libraries comprise at least 10², more preferably, at least 10³, even more preferably at least 10⁴, and still further more preferably, at least 10⁵ unique heavy and/or light chain genes.

[0071] As used herein, the term “plasmid” refers to a circular, extra-chromosomal nucleic acid molecule capable of autonomous replication in a host cell. In preferred embodiments, plasmids of the present invention further comprise retroviral LTRs and one or more heavy and/or light chain genes inserted between the retroviral LTRs.

[0072] As used herein, the term “plasmid library” refers to a plurality of plasmids comprising a plurality of unique antibody genes (e.g., heavy or light chain genes) inserted between retroviral LTRs. In preferred embodiments, plasmid libraries comprise at least 10², more preferably, at least 10³, even more preferably at least 10⁴, and still further more preferably, at least 10⁵ unique heavy and/or light chain genes.

[0073] The term “heterologous membrane-associated protein” refers to a membrane-associated protein that is derived from a virus that is not a member of the same viral class or family as that from which the nucleocapsid protein of the vector particle is derived. “Viral class or family” refers to the taxonomic rank of class or family, as assigned by the International Committee on Taxonomy of Viruses.

[0074] The term “Rhabdoviridae” refers to a family of enveloped RNA viruses that infect animals, including humans, and plants. The Rhabdoviridae family encompasses the genus Vesiculovirus that includes vesicular stomatitis virus (VSV), Cocal virus, Piry virus, Chandipura virus, and Spring viremia of carp virus (sequences encoding the Spring viremia of carp virus are available under GenBank accession number U18101). The G proteins of viruses in the Vesiculovirus genus are virally-encoded integral membrane proteins that form externally projecting homotrimeric spike glycoprotein complexes that are required for receptor binding and membrane fusion. The G proteins of viruses in the Vesiculovirus genus have a covalently bound palmitic acid (C₁₆) moiety. The amino acid sequences of the G proteins from the Vesiculoviruses are fairly well conserved. For example, the Piry virus G proteins share about 38% identity and about 55% similarity with the VSV G proteins (several strains of VSV are known, e.g., Indiana, New Jersey, Orsay, San Juan, etc., and their G proteins are highly homologous). The Chandipura virus G protein and the VSV G proteins share about 37% identity and 52% similarity. Given the high degree of conservation (amino acid sequence) and the related functional characteristics (e.g., binding of the virus to the host cell and fusion of membranes, including syncytia formation) of the G proteins of the Vesiculoviruses, the G proteins from non-VSV Vesiculoviruses may be used in place of the VSV G protein for the pseudotyping of viral particles. The G proteins of the Lyssa viruses (another genus within the Rhabdoviridae family) also share a fair degree of conservation with the VSV G proteins and function in a similar manner (e.g., mediate fusion of membranes) and therefore may be used in place of the VSV G protein for the pseudotyping of viral particles. The Lyssa viruses include the Mokola virus and the Rabies viruses (several strains of Rabies virus are known and their G proteins have been cloned and sequenced). The Mokola virus G protein shares stretches of homology (particularly over the extracellular and transmembrane domains) with the VSV G proteins, showing about 31% identity and 48% similarity. Preferred G proteins share at least 25% identity, preferably at least 30% identity and most preferably at least 35% identity with the VSV G proteins. The VSV G protein from the New Jersey strain (the sequence of this G protein is provided in GenBank accession numbers M27165 and M21557) is employed as the reference VSV G protein.
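
The identity thresholds recited above (at least 25%, 30% and 35% identity with the reference VSV G protein) can be applied to a pairwise alignment as in the following minimal sketch. The function names and tier labels are hypothetical. The choice to count identity only over columns in which both sequences have a residue is an assumption, since the text does not specify how percent identity is calculated.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: only alignment columns in which neither sequence has a gap
    are counted in the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def preference_tier(identity_pct):
    """Map percent identity with the reference G protein to a tier."""
    if identity_pct >= 35:
        return "most preferred"
    if identity_pct >= 30:
        return "more preferred"
    if identity_pct >= 25:
        return "preferred"
    return "below threshold"
```

Using the approximate figures given in the text, the Piry (about 38%) and Chandipura (about 37%) G proteins would fall in the highest tier, and the Mokola G protein (about 31%) in the middle tier.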

[0075] As used herein, the term “lentivirus vector” refers to retroviral vectors derived from the Lentiviridae family (e.g., human immunodeficiency virus, simian immunodeficiency virus, equine infectious anemia virus, and caprine arthritis-encephalitis virus) that are capable of integrating into non-dividing cells (See, e.g., U.S. Pat. Nos. 5,994,136 and 6,013,516, both of which are incorporated herein by reference).

[0076] The term “pseudotyped lentivirus vector” refers to a lentivirus vector containing a heterologous membrane protein (e.g., a viral envelope glycoprotein or the G proteins of viruses in the Rhabdoviridae family such as VSV, Piry, Chandipura and Mokola).

[0077] As used herein, the term “in vitro” refers to an artificial environment and to processes or reactions that occur within an artificial environment. In vitro environments can consist of, but are not limited to, test tubes and cell cultures. The term “in vivo” refers to the natural environment (e.g., an animal or a cell) and to processes or reactions that occur within a natural environment.

[0078] As used herein, the term “immunoglobulin” refers to proteins that bind a specific antigen. Immunoglobulins include, but are not limited to, polyclonal, monoclonal, chimeric, and humanized antibodies, Fab fragments, and F(ab′)₂ fragments, and include immunoglobulins of the following classes: IgG, IgA, IgM, IgD, IgE, and secreted immunoglobulins (sIg). Immunoglobulins generally comprise two identical heavy chains and two light chains.

[0079] As used herein, the term “antigen binding protein” refers to proteins that bind to a specific antigen. “Antigen binding proteins” include, but are not limited to, immunoglobulins, including polyclonal, monoclonal, chimeric, and humanized antibodies; Fab fragments, F(ab′)₂ fragments, and Fab expression libraries; and single chain antibodies.

[0080] As used herein, the term “antibody library” refers to a plurality of antibodies comprising a plurality of unique immunoglobulins or antibody chains (e.g., heavy or light chains). In preferred embodiments, antibody libraries comprise at least 10², more preferably, at least 10³, even more preferably at least 10⁴, and still more preferably, at least 10⁵ unique antibodies or antibody chains.

[0081] As used herein, the term “pre-selected antigen” refers to a known antigen for which it is desired to identify an “antigen binding protein” or antibody that specifically binds the pre-selected antigen. Such antigen binding proteins or antibodies can be identified by “screening said antibody library.” As used herein, the term “screening said antibody library” refers to the process of identifying antibodies within an antibody library that specifically bind to a pre-selected antigen. Screening may be carried out using any suitable method that is able to identify specific interactions between the pre-selected antigen and antibodies, including but not limited to, those screening methods disclosed herein. Preferably, screening is carried out in a high-throughput manner.

[0082] As used herein, the term “solution based detection assay” when used in the context of “screening said antibody library” refers to an assay for detecting the binding of antibodies to a pre-selected antigen that is conducted in solution (e.g., an aqueous solution). Examples of solution based detection assays include, but are not limited to, radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, immunoprecipitation reactions, agglutination assays (e.g., hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, and protein A assays.

[0083] As used herein, the term “reporter gene” refers to a gene encoding a protein that may be assayed. Examples of reporter genes include, but are not limited to, luciferase (See, e.g., deWet et al., Mol. Cell. Biol. 7:725 [1987] and U.S. Pat. Nos. 6,074,859; 5,976,796; 5,674,713; and 5,618,682; all of which are incorporated herein by reference), green fluorescent protein (e.g., GenBank Accession Number U43284; a number of GFP variants are commercially available from CLONTECH Laboratories, Palo Alto, Calif.), chloramphenicol acetyltransferase, β-galactosidase, alkaline phosphatase, and horseradish peroxidase.

[0084] As used herein, the term “purified” refers to molecules, either nucleic or amino acid sequences that are removed from their natural environment, isolated or separated. An “isolated nucleic acid sequence” is therefore a purified nucleic acid sequence. “Substantially purified” molecules are at least 60% free, preferably at least 75% free, and more preferably at least 90% free from other components with which they are naturally associated.

[0085] The term “test compound” refers to any chemical entity, pharmaceutical, drug, and the like contemplated to be useful in the treatment and/or prevention of a disease, illness, sickness, or disorder of bodily function, or to otherwise alter the physiological or cellular status of a sample. Test compounds comprise both known and potential therapeutic compounds. A test compound can be determined to be therapeutic by screening using the screening methods of the present invention. A “known therapeutic compound” refers to a therapeutic compound that has been shown (e.g., through animal trials or prior experience with administration to humans) to be effective in such treatment or prevention.

DETAILED DESCRIPTION OF THE INVENTION

[0086] The present invention relates to the production of proteins in host cells, and more particularly to the production of antibody libraries. The present invention utilizes integrating retroviral vectors to create cell lines containing a library of unique antibody heavy and/or light chains. The antibody libraries of the present invention have the further advantage of strict control over MOI (e.g., only one antibody heavy chain and one antibody light chain per cell).
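
The relationship between multiplicity of infection (MOI) and the number of integrated vector copies per cell can be sketched numerically. The sketch below assumes that integration events per cell follow a Poisson distribution with mean equal to the MOI. That statistical model is a common simplification and is not stated in this disclosure; the function names are hypothetical.

```python
import math


def poisson_fraction(moi, k):
    """Fraction of cells expected to carry exactly k integrated copies,
    assuming (hypothetically) Poisson-distributed integration events."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def single_copy_among_transduced(moi):
    """Among cells carrying at least one copy, the expected fraction that
    carry exactly one copy. Requires moi > 0."""
    if moi <= 0:
        raise ValueError("moi must be positive")
    p0 = poisson_fraction(moi, 0)
    p1 = poisson_fraction(moi, 1)
    return p1 / (1.0 - p0)
```

Under this assumed model, lowering the MOI increases the proportion of transduced cells that carry a single copy, which is one way a single heavy chain gene and a single light chain gene per cell might be favored.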

I. Vectors and Methods for Transfection

[0087] According to the present invention, antibody libraries are generated using integrating retroviral vectors comprising antibody heavy and/or light chain genes. The design, production, and use of these vectors in the present invention are described below.

[0088] A. Retroviral Vectors

[0089] Retroviruses (family Retroviridae) are divided into three groups: the spumaviruses (e.g., human foamy virus); the lentiviruses (e.g., human immunodeficiency virus and sheep visna virus) and the oncoviruses (e.g., MLV, Rous sarcoma virus).

[0090] Retroviruses are enveloped (i.e., surrounded by a host cell-derived lipid bilayer membrane) single-stranded RNA viruses that infect animal cells. When a retrovirus infects a cell, its RNA genome is converted into a double-stranded linear DNA form (i.e., it is reverse transcribed). The DNA form of the virus is then integrated into the host cell genome as a provirus. The provirus serves as a template for the production of additional viral genomes and viral mRNAs. Mature viral particles containing two copies of genomic RNA bud from the surface of the infected cell. The viral particle comprises the genomic RNA, reverse transcriptase and other pol gene products inside the viral capsid (which contains the viral gag gene products), which is surrounded by a lipid bilayer membrane derived from the host cell containing the viral envelope glycoproteins (also referred to as membrane-associated proteins).

[0091] The organization of the genomes of numerous retroviruses is well known to the art and this has allowed the adaptation of the retroviral genome to produce retroviral vectors. The production of a recombinant retroviral vector carrying antibody heavy or light chain genes of interest is typically achieved in two stages.

[0092] First, the antibody heavy or light chain gene of interest is inserted into a retroviral vector which contains the sequences necessary for the efficient expression of the antibody heavy or light chain gene of interest (including promoter and/or enhancer elements which may be provided by the viral long terminal repeats (LTRs) or by an internal promoter/enhancer and relevant splicing signals), sequences required for the efficient packaging of the viral RNA into infectious virions (e.g., the packaging signal (Psi), the tRNA primer binding site (−PBS), the 3′ regulatory sequences required for reverse transcription (+PBS)) and the viral LTRs. The LTRs contain sequences required for the association of viral genomic RNA, reverse transcriptase and integrase functions, and sequences involved in directing the expression of the genomic RNA to be packaged in viral particles. For safety reasons, many recombinant retroviral vectors lack functional copies of the genes that are essential for viral replication (these essential genes are either deleted or disabled); therefore, the resulting virus is said to be replication defective.

[0093] Second, following the construction of the recombinant vector, the vector DNA is introduced into a packaging cell line. Packaging cell lines provide proteins required in trans for the packaging of the viral genomic RNA into viral particles having the desired host range (i.e., the viral-encoded gag, pol and env proteins). The host range is controlled, in part, by the type of envelope gene product expressed on the surface of the viral particle. Packaging cell lines may express ecotropic, amphotropic or xenotropic envelope gene products. Alternatively, the packaging cell line may lack sequences encoding a viral envelope (env) protein. In this case the packaging cell line will package the viral genome into particles that lack a membrane-associated protein (e.g., an env protein). In order to produce viral particles containing a membrane-associated protein that will permit entry of the virus into a cell, the packaging cell line containing the retroviral sequences is transfected with sequences encoding a membrane-associated protein (e.g., the G protein of vesicular stomatitis virus (VSV)). The transfected packaging cell will then produce viral particles that contain the membrane-associated protein expressed by the transfected packaging cell line; these viral particles, which contain viral genomic RNA derived from one virus encapsidated by the envelope proteins of another virus, are said to be pseudotyped virus particles.

[0094] The retroviral vectors utilized in the methods and compositions of the present invention can be further modified to include additional regulatory sequences. As described above, the retroviral vectors of the present invention include the following elements in operable association: a) a 5′ LTR; b) a packaging signal; c) a 3′ LTR and d) a nucleic acid encoding a protein of interest located between the 5′ and 3′ LTRs. In some embodiments of the present invention, the nucleic acid of interest may be arranged in opposite orientation to the 5′ LTR when transcription from an internal promoter is desired. Suitable internal promoters include, but are not limited to, the alpha-lactalbumin promoter, the CMV promoter (human or ape), and the thymidine kinase promoter.
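
The arrangement of elements a) through d) recited above can be expressed as a simple ordering check. This is only an illustrative sketch: the element labels and the function name are hypothetical, and the check says nothing about orientation, promoters, or any other feature of an actual construct.

```python
def has_required_elements(elements):
    """Check a 5'->3' ordered list of element labels for a 5' LTR, a
    packaging signal, a 3' LTR, and a gene of interest located between
    the two LTRs. The labels used here are hypothetical."""
    try:
        i5 = elements.index("5' LTR")
        i3 = elements.index("3' LTR")
        gene = elements.index("gene of interest")
    except ValueError:
        # At least one of the required labels is missing.
        return False
    return "packaging signal" in elements and i5 < gene < i3
```

For example, a list such as `["5' LTR", "packaging signal", "internal promoter", "gene of interest", "3' LTR"]` passes the check, whereas a list in which the gene of interest lies outside the LTRs does not.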

[0095] In other embodiments of the present invention, where secretion of the antibody heavy or light chains of interest is desired, the vectors are modified by including a signal peptide sequence in operable association with the protein of interest. The sequences of several suitable signal peptides are known to those in the art, including, but not limited to, those derived from tissue plasminogen activator, human growth hormone, lactoferrin, alpha-casein, and alpha-lactalbumin. In other embodiments, the native signal peptide sequence of the heavy and/or light chain gene is utilized.

[0096] In other embodiments of the present invention, the vectors are modified by incorporating an RNA export element (See, e.g., U.S. Pat. Nos. 5,914,267; 6,136,597; and 5,686,120; and WO99/14310, all of which are incorporated herein by reference) either 3′ or 5′ to the nucleic acid sequence encoding the protein of interest. It is contemplated that the use of RNA export elements allows high levels of expression of the antibody heavy or light chains of interest without incorporating splice signals or introns in the nucleic acid sequence encoding the antibody heavy or light chains of interest.

[0097] In still other embodiments, the vector further comprises at least one internal ribosome entry site (IRES) sequence. The sequences of several suitable IRES's are available, including, but not limited to, those derived from foot and mouth disease virus (FDV), encephalomyocarditis virus, and poliovirus. The IRES sequence can be interposed between two transcriptional units (e.g., nucleic acids encoding different proteins of interest or subunits of a multisubunit protein such as an antibody) to form a polycistronic sequence so that the two transcriptional units are transcribed from the same promoter.

[0098] The retroviral vectors of the present invention may also further comprise a selectable marker allowing selection of transformed cells. A number of selectable markers find use in the present invention, including, but not limited to, the bacterial aminoglycoside 3′ phosphotransferase gene (also referred to as the neo gene) that confers resistance to the drug G418 in mammalian cells, the bacterial hygromycin B phosphotransferase (hyg) gene that confers resistance to the antibiotic hygromycin, and the bacterial xanthine-guanine phosphoribosyl transferase gene (also referred to as the gpt gene) that confers the ability to grow in the presence of mycophenolic acid.

[0099] In still other embodiments of the present invention, the retroviral vectors may comprise recombination elements recognized by a recombination system (e.g., the cre/loxP or flp recombinase systems, see, e.g., Hoess et al., Nucleic Acids Res. 14:2287-2300 [1986], O'Gorman et al., Science 251:1351-55 [1991], van Deursen et al., Proc. Natl. Acad. Sci. USA 92:7376-80 [1995], and U.S. Pat. No. 6,025,192, herein incorporated by reference). After integration of the vectors into the genome of the host cell, the host cell can be transiently transfected (e.g., by electroporation, lipofection, or microinjection) with either a recombinase enzyme (e.g., Cre recombinase) or a nucleic acid sequence encoding the recombinase enzyme and one or more nucleic acid sequences encoding antibody heavy or light chains of interest flanked by sequences recognized by the recombination enzyme so that the nucleic acid sequence is inserted into the integrated vector.

[0100] Viral vectors, including recombinant retroviral vectors, provide a more efficient means of transferring genes into cells as compared to other techniques such as calcium phosphate-DNA co-precipitation or DEAE-dextran-mediated transfection, electroporation or microinjection of nucleic acids. It is believed that the efficiency of viral transfer is due in part to the fact that the transfer of nucleic acid is a receptor-mediated process (i.e., the virus binds to a specific receptor protein on the surface of the cell to be infected). In addition, the virally transferred nucleic acid, once inside a cell, integrates in a controlled manner, in contrast to the integration of nucleic acids which are not virally transferred; nucleic acids transferred by other means such as calcium phosphate-DNA co-precipitation are subject to rearrangement and degradation.

[0101] The most commonly used recombinant retroviral vectors are derived from the amphotropic Moloney murine leukemia virus (MoMLV) (See e.g., Miller and Baltimore Mol. Cell. Biol. 6:2895 [1986]). The MoMLV system has several advantages: 1) this specific retrovirus can infect many different cell types, 2) established packaging cell lines are available for the production of recombinant MoMLV viral particles and 3) the transferred genes are permanently integrated into the target cell chromosome. The established MoMLV vector systems comprise a DNA vector containing a small portion of the retroviral sequence (e.g., the viral long terminal repeat or “LTR” and the packaging or “psi” signal) and a packaging cell line. The antibody heavy or light chain genes to be transferred are inserted into the DNA vector. The viral sequences present on the DNA vector provide the signals necessary for the insertion or packaging of the vector RNA into the viral particle and for the expression of the inserted gene. The packaging cell line provides the proteins required for particle assembly (Markowitz et al., J. Virol. 62:1120 [1988]).

[0102] Despite these advantages, existing retroviral vectors based upon MoMLV are limited by several intrinsic problems: 1) they do not infect non-dividing cells (Miller et al., Mol. Cell. Biol. 10:4239 [1990]), except, perhaps, oocytes; 2) they produce low titers of the recombinant virus (Miller and Rosman, BioTechniques 7: 980 [1980] and Miller, Nature 357: 455 [1990]); and 3) they infect certain cell types (e.g., human lymphocytes) with low efficiency (Adams et al., Proc. Natl. Acad. Sci. USA 89:8981 [1992]). The low titers associated with MoMLV-based vectors have been attributed, at least in part, to the instability of the virus-encoded envelope protein. Concentration of retrovirus stocks by physical means (e.g., ultracentrifugation and ultrafiltration) leads to a severe loss of infectious virus.

[0103] The low titer and inefficient infection of certain cell types by MoMLV-based vectors have been overcome by the use of pseudotyped retroviral vectors that contain the G protein of VSV as the membrane-associated protein. Unlike retroviral envelope proteins, which bind to a specific cell surface protein receptor to gain entry into a cell, the VSV G protein interacts with a phospholipid component of the plasma membrane (Mastromarino et al., J. Gen. Virol. 68:2359 [1977]). Because entry of VSV into a cell is not dependent upon the presence of specific protein receptors, VSV has an extremely broad host range. Pseudotyped retroviral vectors bearing the VSV G protein have an altered host range characteristic of VSV (i.e., they can infect almost all species of vertebrate, invertebrate and insect cells). Importantly, VSV G-pseudotyped retroviral vectors can be concentrated 2000-fold or more by ultracentrifugation without significant loss of infectivity (Burns et al. Proc. Natl. Acad. Sci. USA 90:8033 [1993]).

[0104] The present invention is not limited to the use of the VSV G protein when a viral G protein is employed as the heterologous membrane-associated protein within a viral particle (See, e.g., U.S. Pat. No. 5,512,421, which is incorporated herein by reference). The G proteins of viruses in the Vesiculovirus genus other than VSV, such as the Piry and Chandipura viruses, are highly homologous to the VSV G protein and, like the VSV G protein, contain covalently linked palmitic acid (Brun et al. Intervirol. 38:274 [1995] and Masters et al., Virol. 171:285 [1990]). Thus, the G protein of the Piry and Chandipura viruses can be used in place of the VSV G protein for the pseudotyping of viral particles. In addition, the G proteins of viruses within the Lyssavirus genus, such as Rabies and Mokola viruses, show a high degree of conservation (amino acid sequence as well as functional conservation) with the VSV G proteins. For example, the Mokola virus G protein has been shown to function in a manner similar to the VSV G protein (i.e., to mediate membrane fusion) and therefore may be used in place of the VSV G protein for the pseudotyping of viral particles (Mebatsion et al., J. Virol. 69:1444 [1995]). Viral particles may be pseudotyped using either the Piry, Chandipura or Mokola G protein as described in Example 2, with the exception that a plasmid containing sequences encoding either the Piry, Chandipura or Mokola G protein under the transcriptional control of a suitable promoter element (e.g., the CMV immediate-early promoter; numerous expression vectors containing the CMV IE promoter are available, such as the pcDNA3.1 vectors (Invitrogen)) is used in place of pHCMV-G. Sequences encoding other G proteins derived from other members of the Rhabdoviridae family may be used; sequences encoding numerous rhabdoviral G proteins are available from the GenBank database.

[0105] The majority of retroviruses can transfer or integrate a double-stranded linear form of the virus (the provirus) into the genome of the recipient cell only if the recipient cell is cycling (i.e., dividing) at the time of infection. Retroviruses that have been shown to infect dividing cells exclusively, or more efficiently, include MLV, spleen necrosis virus, Rous sarcoma virus and human immunodeficiency virus (HIV; while HIV infects dividing cells more efficiently, HIV can infect non-dividing cells).

[0106] It has been shown that the integration of MLV virus DNA depends upon the host cell's progression through mitosis and it has been postulated that the dependence upon mitosis reflects a requirement for the breakdown of the nuclear envelope in order for the viral integration complex to gain entry into the nucleus (Roe et al., EMBO J. 12:2099 [1993]). However, as integration does not occur in cells arrested in metaphase, the breakdown of the nuclear envelope alone may not be sufficient to permit viral integration; there may be additional requirements such as the state of condensation of the genomic DNA (Roe et al., supra).

[0107] The present invention also contemplates the use of lentiviral vectors to generate high copy number cell lines. The lentiviruses (e.g., equine infectious anemia virus, caprine arthritis-encephalitis virus, human immunodeficiency virus) are a subfamily of retroviruses that are able to integrate into non-dividing cells. The lentiviral genome and the proviral DNA have the three genes found in all retroviruses: gag, pol, and env, which are flanked by two LTR sequences. The gag gene encodes the internal structural proteins (e.g., matrix, capsid, and nucleocapsid proteins); the pol gene encodes the reverse transcriptase, protease, and integrase proteins; and the env gene encodes the viral envelope glycoproteins. The 5′ and 3′ LTRs control transcription and polyadenylation of the viral RNAs. Additional genes in the lentiviral genome include the vif, vpr, tat, rev, vpu, nef, and vpx genes.

[0108] A variety of lentiviral vectors and packaging cell lines are known in the art and find use in the present invention (See, e.g., U.S. Pat. Nos. 5,994,136 and 6,013,516, both of which are herein incorporated by reference). Furthermore, the VSV G protein has also been used to pseudotype retroviral vectors based upon the human immunodeficiency virus (HIV) (Naldini et al., Science 272:263 [1996]). Thus, the VSV G protein may be used to generate a variety of pseudotyped retroviral vectors and is not limited to vectors based on MoMLV. The lentiviral vectors may also be modified as described above to contain various regulatory sequences (e.g., signal peptide sequences, RNA export elements, and IRES's). After the lentiviral vectors are produced, they may be used to transfect host cells as described below for retroviral vectors.

II. Use of Host Cells To Produce Antibodies

[0109] In some preferred embodiments, the methods of the present invention are used to generate antibody libraries from immunoglobulin heavy and light chain genes. In some embodiments, the host cells express more than one exogenous protein. For example, the host cells may be transfected with vectors encoding different proteins of interest (e.g., by cotransfection with one vector encoding a first protein of interest (e.g., immunoglobulin light chain) and a second vector encoding a second protein of interest (e.g., immunoglobulin heavy chain), or by serial transfection or infection) so that the host cell contains at least one integrated copy of a first vector encoding a first antibody heavy or light chain of interest and at least one integrated copy of a second integrating vector encoding a second antibody heavy or light chain of interest.

[0110] A. Antibody Genes

[0111] The present invention is not limited to the use of particular antibody genes. In some embodiments, antibody heavy and/or light chain genes are obtained commercially. Commercially available libraries include, but are not limited to, those available from Cambridge Antibody Technology (Cambridgeshire, United Kingdom), HUCAL libraries (See e.g., U.S. Pat. No. 5,514,548, herein incorporated by reference) available from Morphosys (Munich, Germany), Bioinvent (Lund, Sweden), and INTRACEL (Rockville, Md.). In other embodiments, antibody heavy and light chain genes are obtained by PCR (e.g., including but not limited to, the method disclosed in U.S. Pat. No. 6,291,650, herein incorporated by reference).

[0112] B. Generation of Antibody Libraries

[0113] In some embodiments, greater than one (e.g., two or more, preferably five or more, and more preferably, 10 or more) heavy and light chains are used to generate antibody libraries using retroviral vectors. In some embodiments, antibody genes are first cloned into GATEWAY (Invitrogen, Carlsbad, Calif.) entry vectors. In preferred embodiments, heavy chain antibody sequences (one gene per vector) are cloned into vectors comprising a first selectable marker and light chain antibody sequences are cloned into vectors comprising a second selectable marker.
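The number of distinct antibodies that such a library can contain is the product of the number of heavy chains and the number of light chains. The following is a minimal sketch of that enumeration; the chain labels and the function name pair_chains are hypothetical and serve only as an illustration.

```python
from itertools import product


def pair_chains(heavy_chains, light_chains):
    """Return every heavy/light pairing as a list of (heavy, light) tuples."""
    return list(product(heavy_chains, light_chains))


# Hypothetical chain labels, used only for illustration.
heavy = ["HC1", "HC2", "HC3"]
light = ["LC1", "LC2"]

library = pair_chains(heavy, light)
# The library size is len(heavy) * len(light).
print(len(library))
for heavy_chain, light_chain in library:
    print(heavy_chain, light_chain)
```

Under this simple model, ten heavy chains and ten light chains would give up to 100 possible pairings, although the pairings actually recovered depend on transduction and selection.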

[0114] In some embodiments, antibody genes are next transferred into retroviral vectors containing GATEWAY recombination sequences inserted in between retroviral LTR sequences (See e.g., the above description of retroviral vectors). In some embodiments, each retroviral vector contains either a heavy chain or a light chain antibody gene, as well as one of two selectable markers. In other embodiments, the retroviral vectors contain one heavy chain gene and one light chain gene separated by an IRES sequence.

[0115] In some embodiments, following transfer of antibody genes into retroviral vectors, the vectors are packaged in a packaging cell line (e.g., 293 GP cells) to generate retroviral particles. Retroviral particles may be generated using any suitable method, including but not limited to, those described below. In some embodiments, each retroviral particle contains one antibody gene (e.g., either a heavy or a light chain gene). In other embodiments, each vector contains one heavy chain gene and one light chain gene separated by an IRES.

[0116] In some embodiments, retroviral particles are next used to transduce host cells (e.g., mammalian cells). Host cells may be transduced and cultured using any suitable method, including but not limited to, those described below. In preferred embodiments, prior to transduction, the viral titer is determined and the amount of virus necessary to obtain the desired multiplicity of infection (MOI) is used. For example, if retroviral particles containing a single antibody heavy or light chain gene are utilized, a MOI of two is desired. In such embodiments, host cells are first transduced with virus containing either a heavy or light chain gene and grown under conditions that select for the associated selectable marker. Next, the host cells are transduced again with virus containing the other antibody gene and the second selectable marker is selected for, thus resulting in host cells comprising one heavy chain gene and one light chain gene. In other embodiments, both heavy chain containing and light chain containing retroviral particles are simultaneously used to transduce host cells, followed by selection for both markers.

[0117] In yet other embodiments, retroviral particles containing both heavy and light chain antibody genes are used to transduce host cells at a MOI of 1, followed by selection for both markers.
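As a rough illustration of why the titer is determined before transduction, the number of vector copies received by a cell is often approximated by a Poisson distribution whose mean equals the MOI. The Poisson model and the function names below are assumptions made for illustration and are not part of the disclosed method.

```python
import math


def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k vector copies at a given MOI."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def fraction_with_at_least(k_min, moi):
    """Fraction of cells expected to receive k_min or more vector copies."""
    return 1.0 - sum(poisson_pmf(k, moi) for k in range(k_min))


for moi in (1, 2):
    untransduced = poisson_pmf(0, moi)
    single_copy = poisson_pmf(1, moi)
    transduced = fraction_with_at_least(1, moi)
    print(f"MOI {moi}: none={untransduced:.3f} "
          f"one={single_copy:.3f} one_or_more={transduced:.3f}")
```

Under this assumption a fraction of cells receives no vector at any MOI, which is one reason the transduction steps are followed by selection for the associated markers.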

[0118] C. Screening Antibody Libraries

[0119] The present invention contemplates the use of cell lines for screening compounds for activity, and in particular to high throughput screening of compounds from combinatorial libraries (e.g., antibody libraries containing greater than 10² unique antibodies or antibody heavy or light chains). The antibody libraries of the present invention can be screened using a variety of screening methods. In preferred embodiments, antibody libraries are screened for their ability to bind to a pre-selected antigen.

[0120] In some embodiments, antibodies are expressed on the cell surface of host cells as membrane bound antibodies (See e.g., U.S. Pat. Nos. 6,214,613 and 5,298,420, each of which is herein incorporated by reference). Membrane bound antibodies may be screened for antigen binding by any suitable method, including but not limited to, flow cytometry.

[0121] Flow cytometry objectively quantifies and separates single cells on the basis of one or more parameters (e.g., binding to a pre-selected antigen). Flow cytometry involves channeling individual cells in a narrow fluid stream past a laser beam, which is usually oriented at a right angle to the flow. Optical sensors detect signals generated as the cells pass through the laser beam. The cells scatter the laser light in proportion to their size and “complexity” (e.g. presence of granules in their cytoplasm). Thus, cells can be identified based on their light scatter characteristics, and a population chosen (gated) for further analysis.

[0122] In some embodiments, pre-selected antigens coupled to fluorochromes (different fluorochromes emit different wavelengths of light upon excitation by a laser) are used to label or “stain” the cells so that each cell can be identified and quantitated based upon its fluorescence signal. In other embodiments, secondary antibodies that specifically bind to the pre-selected antigen are coupled to fluorochromes and used for detection. A computer collects the fluorescence signature of each cell and displays the pattern of fluorescence for the user to analyze. In other applications, where one might want to separate cells which have a certain staining pattern from all other cells (e.g., due to binding to a labeled pre-selected antigen), the flow cytometry machine can direct those desired cells into a tube provided by the user. This is called fluorescence activated cell sorting (FACS).

[0123] In other embodiments, antibodies generated by the methods of the present invention are secreted into medium (e.g., using the methods described in Example 3). For example, in some embodiments, antibodies are secreted in 96 well plates. Each well of the plate can then be diluted, for example to 100 cells per well. The plates can be screened for binding to a pre-selected antigen using any suitable method. Any immunoassay that tests for binding specificity familiar to the skilled artisan may be used in this step and subsequent steps involving measures of binding with cells, including but not limited to, radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, immunoprecipitation reactions, agglutination assays (e.g., hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, and protein A assays. Wells giving a positive signal can then be further diluted to contain 1-10 antibody producing cells. These plates can then be further screened in order to identify the antibody producing cell(s) with the desired binding properties. The desired cells can be used to generate stable cell lines (e.g., using the methods described in Example 3).
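The two-stage dilution described above can be reasoned about with a simple estimate of how often a well contains at least one antigen-binding cell. The sketch below assumes that cells are distributed into wells independently and that a fraction binder_fraction of the library binds the antigen; both the model and the example value of 1% are hypothetical.

```python
def positive_well_probability(binder_fraction, cells_per_well):
    """Chance that a well holds at least one binder, assuming independent cells."""
    if not 0.0 <= binder_fraction <= 1.0:
        raise ValueError("binder_fraction must be between 0 and 1")
    return 1.0 - (1.0 - binder_fraction) ** cells_per_well


# Hypothetical binder frequency of 1%, first at 100 cells per well,
# then at 10 cells per well after further dilution of positive wells.
first_round = positive_well_probability(0.01, 100)
second_round = positive_well_probability(0.01, 10)
print(f"first round:  {first_round:.3f}")
print(f"second round: {second_round:.3f}")
```

Fewer cells per well lowers the chance that any given well is positive but makes it more likely that a positive well traces back to a single antibody producing cell.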

[0124] The present invention is not limited to the screening methods disclosed herein. One skilled in the art recognizes that any suitable method may be utilized that results in the identification of antibodies with the desired properties (e.g., antigen binding).

III. Generation of Host Cells Comprising Integrated Retroviral Vectors

[0125] The present invention further provides methods of generating host cells comprising integrated retroviral vectors comprising antibody heavy or light chain genes.

[0126] A. Transfection of Integrating Vectors

[0127] Once integrating vectors (e.g., retroviral vectors) encoding antibody heavy or light chains of interest have been produced, they may be used to transfect or transduce host cells (examples of which are described below). Preferably, host cells are transfected or transduced with integrating vectors at a multiplicity of infection sufficient to result in the integration of the desired number of vectors (e.g., one or two). When non-pseudotyped retroviral vectors are utilized for infection, the host cells are incubated with the culture medium from the retroviral producing cells containing the desired titer (i.e., colony forming units, CFUs) of infectious vectors. When pseudotyped retroviral vectors are utilized, the vectors are concentrated to the appropriate titer by ultracentrifugation and then added to the host cell culture. Alternatively, the concentrated vectors can be diluted in a culture medium appropriate for the cell type.

[0128] In each case, the host cells are exposed to medium containing the infectious retroviral vectors for a sufficient period of time to allow infection and subsequent integration of the vectors. In general, the amount of medium used to overlay the cells should be kept to as small a volume as possible so as to encourage the maximum number of integration events per cell. As a general guideline, the number of colony forming units (cfu) per milliliter should be about 10⁵ to 10⁷ cfu/ml, depending upon the number of integration events desired. The host cells (See below description of host cells) are then cultured (e.g., according to the methods described below).
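The relationship between titer, cell number and multiplicity of infection can be sketched as follows. The function name stock_volume_ml and the example cell count are hypothetical; the calculation simply divides the number of infectious units required by the titer of the stock.

```python
def stock_volume_ml(cell_count, target_moi, titer_cfu_per_ml):
    """Volume of vector stock (ml) that supplies target_moi cfu per cell."""
    if titer_cfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return cell_count * target_moi / titer_cfu_per_ml


# Hypothetical example: 1e5 cells, MOI of 1, across the guideline titer range.
for titer in (1e5, 1e6, 1e7):
    volume = stock_volume_ml(cell_count=1e5, target_moi=1, titer_cfu_per_ml=titer)
    print(f"titer {titer:.0e} cfu/ml -> {volume:.2f} ml of stock")
```

A higher titer allows the same MOI to be reached in a smaller overlay volume, consistent with the guidance to keep the volume of medium small.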

[0129] B. Host Cells

[0130] The present invention contemplates the transfection of a variety of host cells with retroviral vectors in order to generate the antibody libraries of the present invention. A number of mammalian host cell lines are known in the art. In general, these host cells are capable of growth and survival when placed in either monolayer culture or in suspension culture in a medium containing the appropriate nutrients and growth factors, as is described in more detail below. Typically, the cells are capable of expressing and secreting large quantities of a particular antibody heavy or light chain of interest into the culture medium. Examples of suitable mammalian host cells include, but are not limited to, Chinese hamster ovary cells (CHO-K1, ATCC CCL 61); bovine mammary epithelial cells (ATCC CRL 10274); monkey kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651); human embryonic kidney line (293 or 293 cells subcloned for growth in suspension culture; see, e.g., Graham et al., J. Gen Virol., 36:59 [1977]); baby hamster kidney cells (BHK, ATCC CCL 10); mouse sertoli cells (TM4, Mather, Biol. Reprod. 23:243-251 [1980]); monkey kidney cells (CV1 ATCC CCL 70); African green monkey kidney cells (VERO-76, ATCC CRL-1587); human cervical carcinoma cells (HELA, ATCC CCL 2); canine kidney cells (MDCK, ATCC CCL 34); buffalo rat liver cells (BRL 3A, ATCC CRL 1442); human lung cells (WI38, ATCC CCL 75); human liver cells (Hep G2, HB 8065); mouse mammary tumor (MMT 060562, ATCC CCL51); TRI cells (Mather et al., Annals N.Y. Acad. Sci., 383:44-68 [1982]); MRC 5 cells; FS4 cells; rat fibroblasts (208F cells); MDBK cells (bovine kidney cells); and a human hepatoma line (Hep G2).

[0131] The present invention also contemplates the use of amphibian and insect host cell lines. Examples of suitable insect host cell lines include, but are not limited to, mosquito cell lines (e.g., ATCC CRL-1660). Examples of suitable amphibian host cell lines include, but are not limited to, toad cell lines (e.g., ATCC CCL-102).

[0132] C. Host Cell Culture

[0133] The transfected host cells are cultured according to methods known in the art. Suitable culture conditions for mammalian cells are well known in the art (See e.g., J. Immunol. Methods 56:221-234 [1983]; Animal Cell Culture: A Practical Approach 2nd Ed., Rickwood, D. and Hames, B. D., eds. Oxford University Press, New York [1992]).

[0134] The host cell cultures of the present invention are prepared in a medium suitable for the particular cell being cultured. Commercially available media such as Ham's F10 (Sigma, St. Louis, Mo.), Minimal Essential Medium (MEM, Sigma), RPMI-1640 (Sigma), and Dulbecco's Modified Eagle's Medium (DMEM, Sigma) are exemplary nutrient solutions. Suitable media are also described in U.S. Pat. Nos. 4,767,704; 4,657,866; 4,927,762; 5,122,469; 4,560,655; and WO 90/03430 and WO 87/00195; the disclosures of which are herein incorporated by reference. Any of these media may be supplemented as necessary with serum, hormones and/or other growth factors (such as insulin, transferrin, or epidermal growth factor), salts (such as sodium chloride, calcium, magnesium, and phosphate), buffers (such as HEPES), nucleosides (such as adenosine and thymidine), antibiotics (such as gentamicin), trace elements (defined as inorganic compounds usually present at final concentrations in the micromolar range), lipids (such as linoleic or other fatty acids) and their suitable carriers, and glucose or an equivalent energy source. Any other necessary supplements may also be included at appropriate concentrations that would be known to those skilled in the art. For mammalian cell culture, the osmolality of the culture medium is generally about 290-330 mOsm.

[0135] The present invention also contemplates the use of a variety of culture systems (e.g., petri dishes, 96 well plates, roller bottles, and bioreactors) for the transfected host cells. For example, the transfected host cells can be cultured in a perfusion system. Perfusion culture refers to providing a continuous flow of culture medium through a culture maintained at high cell density. The cells are suspended and do not require a solid support to grow on. Generally, fresh nutrients must be supplied continuously with concomitant removal of toxic metabolites and, ideally, selective removal of dead cells. Filtering, entrapment and microencapsulation methods are all suitable for refreshing the culture environment at sufficient rates.

[0136] As another example, in some embodiments a fed batch culture procedure can be employed. In the preferred fed batch culture, the mammalian host cells and culture medium are supplied to a culturing vessel initially and additional culture nutrients are fed, continuously or in discrete increments, to the culture during culturing, with or without periodic cell and/or product harvest before termination of culture. The fed batch culture can include, for example, a semi-continuous fed batch culture, wherein periodically whole culture (including cells and medium) is removed and replaced by fresh medium. Fed batch culture is distinguished from simple batch culture, in which all components for cell culturing (including the cells and all culture nutrients) are supplied to the culturing vessel at the start of the culturing process. Fed batch culture can be further distinguished from perfusion culturing insofar as the supernatant is not removed from the culturing vessel during the process (in perfusion culturing, the cells are restrained in the culture by, e.g., filtration, encapsulation, anchoring to microcarriers etc. and the culture medium is continuously or intermittently introduced and removed from the culturing vessel). In some particularly preferred embodiments, the batch cultures are performed in roller bottles.

[0137] Further, the cells of the culture may be propagated according to any scheme or routine that may be suitable for the particular host cell and the particular production plan contemplated. Therefore, the present invention contemplates a single step or multiple step culture procedure. In a single step culture the host cells are inoculated into a culture environment and the processes of the instant invention are employed during a single production phase of the cell culture. Alternatively, a multi-stage culture is envisioned. In the multi-stage culture cells may be cultivated in a number of steps or phases. For instance, cells may be grown in a first step or growth phase culture wherein cells, possibly removed from storage, are inoculated into a medium suitable for promoting growth and high viability. The cells may be maintained in the growth phase for a suitable period of time by the addition of fresh medium to the host cell culture.

[0138] Fed batch or continuous cell culture conditions are devised to enhance growth of the mammalian cells in the growth phase of the cell culture. In the growth phase cells are grown under conditions and for a period of time that is maximized for growth. Culture conditions, such as temperature, pH, dissolved oxygen (dO₂) and the like, are those used with the particular host and will be apparent to the ordinarily skilled artisan. Generally, the pH is adjusted to a level between about 6.5 and 7.5 using either an acid (e.g., CO₂) or a base (e.g., Na₂CO₃ or NaOH). A suitable temperature range for culturing mammalian cells such as CHO cells is between about 30° to 38° C. and a suitable dO₂ is between 5-90% of air saturation.

[0139] In some embodiments, following the antibody heavy and/or light chain production phase, the antibody heavy and/or light chains of interest are recovered from the culture medium using techniques that are well established in the art. In some embodiments, the heavy and/or light chains are preferably recovered from the culture medium as secreted polypeptides (e.g., the secretion of the heavy and/or light chain of interest is directed by a signal peptide sequence), although they also may be recovered from host cell lysates. As a first step, the culture medium or lysate is centrifuged to remove particulate cell debris. The polypeptide thereafter is purified from contaminant soluble proteins and polypeptides, with the following procedures being exemplary of suitable purification procedures: fractionation on immunoaffinity or ion-exchange columns; ethanol precipitation; reverse phase HPLC; chromatography on silica or on an ion-exchange resin such as DEAE; chromatofocusing; SDS-PAGE; ammonium sulfate precipitation; gel filtration using, for example, Sephadex G-75; and protein A Sepharose columns to remove contaminants such as IgG. A protease inhibitor such as phenyl methyl sulfonyl fluoride (PMSF) also may be useful to inhibit proteolytic degradation during purification. Additionally, the protein of interest can be fused in frame to a marker sequence, which allows for purification of the protein of interest. Non-limiting examples of marker sequences include a hexahistidine tag that may be supplied by a vector, preferably a pQE-9 vector, and a hemagglutinin (HA) tag. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein (See e.g., Wilson et al., Cell, 37:767 [1984]). One skilled in the art will appreciate that purification methods suitable for the polypeptide of interest may require modification to account for changes in the character of the polypeptide upon expression in recombinant cell culture.

EXPERIMENTAL

[0140] The following examples serve to illustrate certain preferred embodiments and aspects of the present invention and are not to be construed as limiting the scope thereof.

[0141] In the experimental disclosure which follows, the following abbreviations apply: M (molar); mM (millimolar); μM (micromolar); nM (nanomolar); mol (moles); mmol (millimoles); μmol (micromoles); nmol (nanomoles); gm (grams); mg (milligrams); μg (micrograms); pg (picograms); L (liters); ml (milliliters); μl (microliters); cm (centimeters); mm (millimeters); μm (micrometers); nm (nanometers); ° C. (degrees Centigrade); AMP (adenosine 5′-monophosphate); BSA (bovine serum albumin); cDNA (copy or complementary DNA); CS (calf serum); DNA (deoxyribonucleic acid); ssDNA (single stranded DNA); dsDNA (double stranded DNA); dNTP (deoxyribonucleotide triphosphate); LH (luteinizing hormone); NIH (National Institutes of Health, Bethesda, Md.); RNA (ribonucleic acid); PBS (phosphate buffered saline); g (gravity); OD (optical density); HEPES (N-[2-Hydroxyethyl]piperazine-N-[2-ethanesulfonic acid]); HBS (HEPES buffered saline); SDS (sodium dodecylsulfate); Tris-HCl (tris[Hydroxymethyl]aminomethane-hydrochloride); Klenow (DNA polymerase I large (Klenow) fragment); rpm (revolutions per minute); EGTA (ethylene glycol-bis(β-aminoethyl ether) N,N,N′,N′-tetraacetic acid); EDTA (ethylenediaminetetraacetic acid); bla (β-lactamase or ampicillin-resistance gene); ORI (plasmid origin of replication); lacI (lac repressor); X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside); ATCC (American Type Culture Collection, Rockville, Md.); GIBCO/BRL (GIBCO/BRL, Grand Island, N.Y.); Perkin-Elmer (Perkin-Elmer, Norwalk, Conn.); and Sigma (Sigma Chemical Company, St. Louis, Mo.).

Example 1 Construction of Retroviral Vectors Containing Antibody Genes

[0142] This Example describes the cloning of the heavy chains of MN14 and LL2 antibodies into a Gateway vector (Invitrogen, Calif.) incorporating one selectable marker, and the light chains of MN14 and LL2 antibodies into a second Gateway vector with a second selectable marker. Co-transfection into retroviral vectors with both vector “libraries” and selection for both markers allows for the formation of antibodies with all possible heavy chain/light chain combinations.

[0143] A. Vector Construction

[0144] The GATEWAY (Invitrogen, Carlsbad, Calif.) system is a cloning system based on site-specific recombination. Sequences of interest are cloned into a first GATEWAY vector (referred to as an entry clone). The sequences of interest can then be transferred to destination vectors (e.g., those containing retroviral LTRs) containing compatible recombination sequences through site-specific recombination.

[0145] Retroviral vectors were constructed containing the light and heavy chains from MN14 and LL2 antibodies. First, both the light chain genes and the heavy chain genes were cloned into the GATEWAY entry vector pENTR11 with the NcoI sites upstream of the 5′ EcoRI site removed.

[0146] For heavy chain genes, the destination vector used was pLBCG-S, which contains the retroviral LTR sequences flanking GATEWAY recombination sequences and a blasticidin selectable marker. The splice-site-removed versions of both the MN14 and LL2 heavy chain genes were recombined from pENTR11-M4HCF or pENTR11-L2HCF into the pLBCG-S plasmid to give pLBC-L2HCF (SEQ ID NO: 1) and pLBC-M4HCF (SEQ ID NO: 2) (See FIGS. 1 and 2).

[0147] The light chains were recombined into the Gateway version of pLNC to give plasmids pLNC-L2LC (SEQ ID NO: 3) and pLNC-M4LC (SEQ ID NO: 4) (See FIGS. 3 and 4).

[0148] A test of recombining different ratios of the light chain constructs into the expression vector (pLNC-G) was performed. Clones from the “library” were screened to determine the number of clones that can be obtained from a reaction, the frequency of clones without inserts and the representation of the clones. Three recombination reactions were performed using different ratios of the light chain constructs (1:1 LL2 LC:MN14 LC, 1:4 and 4:1) in the expression vector (pLNC-G). All three reactions gave ˜5000 clones from transforming 2 μl of a 22 μl recombination reaction. 150 ng of entry DNA and 300 ng of destination vector were used. 15 clones from each of the reactions were screened by mini-preps. All clones had either the MN14 or LL2 light chain insert; there were no clones without inserts. Of 15 clones screened from each of the 3 reactions, the products showed ˜1:1 (10 LL2:8 MN14), ˜4:1 (11 LL2:4 MN14) and ˜1:4 (4 LL2:11 MN14).

[0149] B. Library Construction

[0150] The construction of the library was performed in two steps: 1) creation of a light chain (LL2 and MN14) library in the 293 cell line and 2) ‘superinfection’ of this cell line with the heavy chains from LL2 and MN14. The light chain construction led to an initial vector titer of 1.6×10⁵. The heavy chain initial titer was 4.3×10⁴. The double infected cells were maintained in two selection plates, one containing blasticidin (HC marker), the other containing blasticidin and neomycin. Both cultures grew well.

[0151] Single colonies were made and from an initial 50 clones, 38 viable clones were obtained. The supernatants of these clones were analyzed for human kappa light chain, human Fc (IgG) and CEA antigen binding. For the huLC and huFc assay, the plates were coated with anti-huFab, for the CEA binding assay, the plates were coated with CEA antigen. A summary of the results is shown in Table 1. All values are in ng/ml. Purified MN14 was used as a standard.

TABLE 1
(-, no value reported)

Clone   MN14 CEA   LC         HC         Ratios
#       Activity   Activity   Activity   HC:LC   HC:MN14   MN14:LC
1       211.4      106.6      433.0      4.06    2.05      1.98
2       -          198.7      680.6      3.43    -         -
5       -          39.9       189.5      4.75    -         -
6       4.8        287.6      1519.2     5.28    -         -
7       2.7        242.3      958.5      3.96    -         -
9       -          170.8      586.0      3.43    -         -
10      -          255.0      774.5      3.04    -         -
11      -          128.9      340.9      2.64    -         -
12      -          42.3       151.3      3.58    -         -
13      20.0       310.4      1482.9     4.78    -         -
14      0.6        220.1      825.4      3.75    -         -
15      1698.7     378.4      3496.1     9.24    2.06      4.49
16      -          287.2      1047.5     3.65    -         -
17      -          -          73.0       -       -         -
18      -          117.7      341.5      2.90    -         -
20      1375.5     361.2      2922.7     8.09    2.12      3.81
21      -          357.9      2824.6     7.89    -         -
22      239.3      209.1      658.2      3.15    2.75      1.14
23      2.5        277.7      1038.2     3.74    -         -
24      0.6        380.8      2136.8     5.61    -         -
25      -          322.5      1499.4     4.65    -         -
28      -          211.8      545.0      2.57    -         -
29      1383.0     348.1      2500.3     7.18    1.81      3.97
31      2213.5     357.7      3420.2     9.56    1.55      6.19
32      -          362.8      3256.2     8.97    -         -
33      1287.4     272.5      1329.2     4.88    1.03      4.72
34      -          297.7      1559.7     5.24    -         -
37      -          208.5      488.0      2.34    -         -
39      1953.8     347.0      2724.7     7.85    1.39      5.63
40      -          327.7      2762.6     8.43    -         -
41      -          294.3      2235.6     7.60    -         -
42      -          230.3      1127.2     4.89    -         -
43      1791.8     292.3      2793.4     9.56    1.56      6.13
44      682.2      257.3      1146.7     4.46    1.68      2.65
45      -          319.4      3475.8     10.88   -         -
46      345.0      265.8      1880.8     7.08    5.45      1.30
47      1281.5     241.1      1816.4     7.53    1.42      5.32
48      -          152.6      600.9      3.94    -         -

[0152] All of the 38 clones made anti-huFc reactive components, which can be assembled IgG or heavy chain. 12 out of 38 clones made CEA reactive immunoglobulin in a range between 200 and 2200 ng/ml. 25 out of 38 Fc reactive clones did not react with the CEA antigen. 37 out of 38 clones produced kappa light chain. The ratio of Fc reactive to LC reactive components in the supernatants is highest in clones producing high levels of Fc reactive components. All MN14 reactive clones are also reactive with the anti-Fc antibody.
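By way of illustration only, the ratio columns of Table 1 can be recomputed from the three activity columns. The following is a minimal sketch, assuming that each ratio is the simple quotient of the two named activities rounded to two decimals; the function name `ratios` and the dictionary layout are hypothetical and are not part of the described method.

```python
# Activities (ng/ml) copied from Table 1 for three CEA-reactive clones.
clones = {
    1: {"cea": 211.4, "lc": 106.6, "hc": 433.0},
    15: {"cea": 1698.7, "lc": 378.4, "hc": 3496.1},
    46: {"cea": 345.0, "lc": 265.8, "hc": 1880.8},
}


def ratios(cea, lc, hc):
    """Return (HC:LC, HC:MN14, MN14:LC), each rounded to two decimals.

    Assumption: each ratio is a plain quotient of the two activities.
    """
    return (round(hc / lc, 2), round(hc / cea, 2), round(cea / lc, 2))


for clone_id, values in clones.items():
    print(clone_id, ratios(**values))
```

For the three clones shown, this reproduces the ratios listed in Table 1 (for example, 4.06, 2.05 and 1.98 for clone 1).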

Example 2 Generation of Cell Lines Stably Expressing the MoMLV Gag and Pol Proteins

[0153] Example 1 describes the production of retroviral vectors containing antibody genes. These methods are generally applicable to the production of the vectors described above. The expression of the fusogenic VSV G protein on the surface of cells results in syncytium formation and cell death. Therefore, in order to produce retroviral particles containing the VSV G protein as the membrane-associated protein, a two-step approach was taken. First, stable cell lines expressing the gag and pol proteins from MoMLV at high levels were generated (e.g., 293GP^(SD) cells). The stable cell line, which expresses the gag and pol proteins, produces noninfectious viral particles lacking a membrane-associated protein (e.g., an envelope protein). The stable cell line was then co-transfected, using calcium phosphate precipitation, with VSV-G and gene of interest plasmid DNAs. The pseudotyped vector generated was used to infect 293GP^(SD) cells to produce stably transformed cell lines. Stable cell lines can be transiently transfected with a plasmid capable of directing the high level expression of the VSV G protein (see below). The transiently transfected cells produce VSV G-pseudotyped retroviral vectors that can be collected from the cells over a period of 3 to 4 days before the producing cells die as a result of syncytium formation.

[0154] The first step in the production of VSV G-pseudotyped retroviral vectors, the generation of stable cell lines expressing the MoMLV gag and pol proteins, is described below. The human adenovirus Ad-5-transformed embryonal kidney cell line 293 (ATCC CRL 1573) was cotransfected with pCMV gag-pol and a gene encoding phleomycin resistance. pCMV gag-pol contains the MoMLV gag and pol genes under the control of the CMV promoter (pCMV gag-pol is available from the ATCC).

[0155] The plasmid DNA was introduced into the 293 cells using calcium phosphate co-precipitation (Graham and Van der Eb, Virol. 52:456 [1973]). Approximately 5×10⁵ 293 cells were plated into a 100 mm tissue culture plate the day before the DNA co-precipitate was added. Stable transformants were selected by growth in DMEM-high glucose medium containing 10% FCS and 10 μg/ml phleomycin (selective medium). Colonies that grew in the selective medium were screened for extracellular reverse transcriptase activity (Goff et al, J. Virol. 38:239 [1981]) and intracellular p30gag expression. The presence of p30gag expression was determined by Western blotting using a goat-anti p30 antibody (NCI antiserum 77S000087). A clone that exhibited stable expression of the retroviral genes was selected. This clone was named 293GP^(SD) (293 gag-pol-San Diego). The 293GP^(SD) cell line, a derivative of the human Ad-5-transformed embryonal kidney cell line 293, was grown in DMEM-high glucose medium containing 10% FCS.

Example 3 Preparation of Pseudotyped Retroviral Vectors Bearing the G Glycoprotein of VSV

[0156] In order to produce VSV G protein pseudotyped retrovirus, the following steps were taken. The 293GP^(SD) cell line was co-transfected with the VSV-G plasmid and the plasmid DNA of interest. This co-transfection generates the infectious particles used to infect 293GP^(SD) cells to generate the packaging cell lines. This Example describes the production of pseudotyped LNBOTDC virus. This general method may be used to produce any of the vectors described in Example 1.

[0157] a) Cell Lines and Plasmids

[0158] The packaging cell line, 293GP^(SD), was grown in alpha-MEM-high glucose medium containing 10% FCS. The titer of the pseudotyped virus may be determined using either 208F cells (Quade, Virol. 98:461 [1979]) or NIH/3T3 cells (ATCC CRL 1658); 208F and NIH/3T3 cells are grown in DMEM-high glucose medium containing 10% CS.

[0159] The plasmids utilized were pLBC-L2HCF, pLBC-M4HCF, pLNC-L2LC and pLNC-M4L (see Example 1). The plasmid pHCMV-G contains the VSV G gene under the transcriptional control of the human cytomegalovirus immediate-early promoter (Yee et al., Meth. Cell Biol. 43:99 [1994]).

[0160] b) Production of Stable Packaging Cell Lines, Pseudotyped Vector and Titering of Pseudotyped Vector

[0161] DNA (SEQ ID NOs: 1, 2, 3, or 4) was co-transfected with pHCMV-G DNA into the packaging line 293GP^(SD) to produce virus. The resulting virus was then used to infect 293GP^(SD) cells to transform the cells. The procedure for producing pseudotyped virus was carried out as described (Yee et al., Meth. Cell Biol. 43:99 [1994]).

[0162] This is a retroviral gene construct that upon creation of infectious replication defective retroviral vector will cause the insertion of the sequence described above into the cells of interest. The 3′ viral LTR provides the poly-adenylation sequence for the mRNA.

[0163] Briefly, on day 1, approximately 7×10⁷ 293GP^(SD) cells were placed in a 75 cm² tissue culture flask. The flasks were incubated overnight at 37° C., 5.0% CO₂.

[0164] On the following day (day 2), the medium in the 293GP^(SD) flasks was replaced with harvest medium 2 hours prior to transfection. 293GP^(SD) cells were then co-transfected with 25 μg of plasmid DNA and 25 μg of VSV-G plasmid DNA using the standard calcium phosphate co-precipitation procedure (Graham and Van der Eb, Virol. 52:456 [1973]). Briefly, pHCMV-G DNA, construct DNA, 1:10 TE, and 2M CaCl₂ were combined and mixed. A range of 10 to 40 μg of plasmid DNA was used. 2×HBS (37° C.) was placed into a separate tube. While bubbling air through the 2×HBS, the DNA/1:10 TE/2M CaCl₂ mixture was added dropwise. The transfection mixture was allowed to incubate at room temperature for 20 minutes. Following the incubation period, the correct amount of transfection mixture was added to each culture vessel. The plates or flasks were returned to a 37° C., 5% CO₂ incubator for approximately six hours. Following the incubation period, the transfections were checked for the presence of crystals/precipitate by viewing under an inverted microscope. The transfection medium was then removed from the culture vessels by aspiration with a sterile Pasteur pipet and vacuum pump, and fresh harvest medium was added to each culture vessel. The culture vessels were incubated at 37° C., 5% CO₂ for 24-72 hr.

[0165] On day 3, approximately 7.5×10⁵ 293GP^(SD) cells were placed in a 25 cm² tissue culture flask 24 hours prior to the harvest of the pseudotyped virus from the transfected 293GP^(SD) cells. On day 4, culture medium was harvested from the transfected 293GP^(SD) cells 48 hours after the application of the plasmid DNA with the gene of interest and VSV-G DNA. The culture medium was filtered through a 0.45 μm filter. The culture medium containing LNBOTDC virus was used to infect the 293GP^(SD) cells as follows. The culture medium was removed from the 293GP^(SD) cells and was replaced with the virus-containing culture medium. Polybrene was added to the medium at a final concentration of 8 μg/ml. The virus-containing medium was allowed to remain on the 293GP^(SD) cells for 24 hours. Following the 16 hour infection period (on day 5), the medium was removed from the 293GP^(SD) cells and was replaced with fresh medium containing 400 μg/ml G418 (GIBCO/BRL). The medium was changed approximately every 3 days until only G418-resistant colonies remained.

[0166] The G418-resistant 293GP^(SD) colonies were plated as single cells in 96 wells. Sixty to one hundred G418-resistant colonies were screened for the expression of the BOTDC antibody in order to identify high producing clones. The top 10 clones in 96-well plates were transferred into 6-well plates and allowed to grow to confluency.

[0167] The top 10 clones were then expanded to screen for high titer production. Based on protein expression and titer production, 5 clonal cell lines were selected. One line was designated the master cell bank and the other 4 as backup cell lines. Pseudotyped vector was generated as follows. Approximately 7×10⁷ 293GP^(SD)/cells were placed into a 75 cm² tissue culture flask. Twenty-four hours later, the cells were transfected with 25 μg of pHCMV-G plasmid DNA using calcium phosphate co-precipitation. Six to eight hours after the calcium-DNA precipitate was applied to the cells, the DNA solution was replaced with fresh culture medium (lacking G418). Longer transfection times (overnight) were found to result in the detachment of the majority of the 293GP^(SD)/cells from the plate and are therefore avoided. The transfected 293GP^(SD)/cells produce pseudotyped virus.

[0168] The pseudotyped virus generated from the transfected 293GP^(SD) cells can be collected at least once a day between 24 and 96 hr after transfection. The highest virus titer was generated approximately 48 to 72 hr after initial pHCMV-G transfection. While syncytium formation became visible about 48 hr after transfection in the majority of the transfected cells, the cells continued to generate pseudotyped virus for at least an additional 48 hr as long as the cells remained attached to the tissue culture plate. The collected culture medium containing the VSV G-pseudotyped virus was pooled, filtered through a 0.45 μm filter and stored at −80° C. or concentrated immediately and then stored at −80° C.

[0169] The titer of the VSV G-pseudotyped virus was then determined as follows. Approximately 5×10⁵ rat 208F fibroblast cells were plated into 6-well plates. Twenty-four hours after plating, the cells were infected with serial dilutions of the virus-containing culture medium in the presence of 8 μg/ml polybrene. Twenty-four hours after infection with virus, the medium was replaced with fresh medium containing 400 μg/ml G418 and selection was continued for 14 days until only G418-resistant colonies remained. Viral titers were typically about 0.5 to 5.0×10⁶ colony forming units (cfu)/ml. The virus stock could be concentrated to a titer of greater than 10⁹ cfu/ml as described below.

Example 4 Concentration of Pseudotyped Retroviral Vectors

[0170] The VSV G-pseudotyped viruses were concentrated to a high titer by one cycle of ultracentrifugation. However, in certain embodiments, two cycles are performed for further concentration. The culture medium collected and filtered as described in Example 3, which contained pseudotyped virus, was transferred to Oakridge centrifuge tubes (50 ml Oakridge tubes with sealing caps, Nalge Nunc International) previously sterilized by autoclaving. The virus was sedimented in a JA20 rotor (Beckman) at 48,000×g (20,000 rpm) at 4° C. for 120 min. The culture medium was then removed from the tubes in a biosafety hood and the media remaining in the tubes was aspirated to remove the supernatant. The virus pellet was resuspended to 0.5 to 1% of the original volume in 0.1×HBSS. The resuspended virus pellet was incubated overnight at 4° C. without swirling. The virus pellet could be dispersed with gentle pipetting after the overnight incubation without significant loss of infectious virus. The titer of the virus stock was routinely increased 100- to 300-fold after one round of ultracentrifugation. The efficiency of recovery of infectious virus varied between 30 and 100%.

[0171] The virus stock was then subjected to low speed centrifugation in a microfuge for 5 min at 4° C. to remove any visible cell debris or aggregated virions that were not resuspended under the above conditions. It was noted that if the virus stock is not to be used for injection into oocytes or embryos, this centrifugation step may be omitted.

[0172] In some embodiments, the virus stock is subjected to another round of ultracentrifugation to further concentrate the virus stock. The resuspended virus from the first round of centrifugation is pooled and pelleted by a second round of ultracentrifugation that is performed as described above. Viral titers are increased approximately 2000-fold after the second round of ultracentrifugation (titers of the pseudotyped LNBOTDC virus are typically greater than or equal to 1×10⁹ cfu/ml after the second round of ultracentrifugation).
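By way of illustration only, the relationship between resuspension volume, recovery of infectious virus and the resulting titer can be sketched as follows. This is a minimal sketch under a stated assumption, namely that the titer after one round of pelleting equals the starting titer multiplied by the recovered fraction and divided by the fraction of the original volume used for resuspension. The function name and parameters are hypothetical and are not part of the described method.

```python
def concentrated_titer(start_titer, resuspend_fraction, recovery):
    """Estimate the titer (cfu/ml) after one round of pelleting.

    start_titer: titer before centrifugation, in cfu/ml
    resuspend_fraction: resuspension volume / original volume (e.g. 0.005 to 0.01)
    recovery: fraction of infectious virus recovered (e.g. 0.3 to 1.0)

    Assumption: no loss of virus other than that captured by `recovery`.
    """
    if not 0 < resuspend_fraction <= 1:
        raise ValueError("resuspend_fraction must be in (0, 1]")
    if not 0 <= recovery <= 1:
        raise ValueError("recovery must be in [0, 1]")
    return start_titer * recovery / resuspend_fraction


# Hypothetical input: 1e6 cfu/ml stock, pellet resuspended in 1% of the
# original volume, with full recovery of infectious virus.
print(concentrated_titer(1e6, 0.01, 1.0))
```

Under these assumptions, resuspension in 1% of the original volume with complete recovery corresponds to a 100-fold increase in titer; lower recovery reduces the increase proportionally.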

[0173] The titers of the pre- and post-centrifugation fluids were determined by infection of 208F cells (NIH 3T3 or bovine mammary epithelial cells can also be employed) followed by selection of G418-resistant colonies as described above in Example 3.

[0174] Amplification of retroviral sequences in co-cultures may result in the generation of replication competent retroviruses, thus affecting the safety of the packaging cell line and vector production. Therefore, the cell lines were screened for production of replication competent vector. The 208F cells were expanded to approximately 30% confluency in a T25 flask (˜10⁵ cells). The cells were then infected with 5 ml of infectious vector at 10⁵ CFU/ml plus 8 μg/ml polybrene and grown to confluency (˜24 h), followed by the addition of media supplemented with G418. The cells were then expanded to confluency and the media collected. The media from the infected cells was used to infect new 208F cells. The cells were plated in 6-well plates at 30% confluency (˜10⁵ cells) using the following dilutions: undiluted, 1:2, 1:4, 1:6, 1:8, 1:10. Cells were expanded to confluency, followed by the addition of G418. The cells were then maintained under selection for 14 days to determine the growth of any neo resistant colonies, which indicate the presence of replication competent virus.
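By way of illustration only, the amount of vector applied per cell in the first infection of this screen follows from the stated volume, titer and approximate cell number. The following is a minimal sketch; the function name is hypothetical, and the cell count is the approximate figure given above rather than a measured value.

```python
def vector_per_cell(volume_ml, titer_cfu_per_ml, cell_count):
    """Return the number of colony forming units applied per cell."""
    if cell_count <= 0:
        raise ValueError("cell_count must be positive")
    return volume_ml * titer_cfu_per_ml / cell_count


# 5 ml of vector at 10^5 CFU/ml applied to approximately 10^5 208F cells.
print(vector_per_cell(5, 1e5, 1e5))  # prints 5.0
```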

Example 5 Preparation of Pseudotyped Retrovirus For Infection of Host Cells

[0175] The concentrated pseudotyped retroviruses were resuspended in 0.1×HBS (2.5 mM HEPES, pH 7.12, 14 mM NaCl, 75 μM Na₂HPO₄-H₂O) and 18 μl aliquots were placed in 0.5 ml vials (Eppendorf) and stored at −80° C. until used. The titer of the concentrated vector was determined by diluting 1 μl of the concentrated virus 10⁷- or 10⁸-fold with 0.1×HBS. The diluted virus solution was then used to infect 208F and bovine mammary epithelial cells and viral titers were determined as described in Example 3. 8 μg/ml polybrene was added to each well. The plates were incubated for 24 hr. Media was removed from the wells by aspiration with a sterile Pasteur pipet and vacuum. The wells were replenished with the appropriate selection medium. The media was replenished as necessary, as indicated by a change (to yellow) in media color. Initially this was every two days; as fewer cells remained, media changes became less frequent. At day 10-14 (depending on the selection used), the media was removed and the cells were fixed with 100% methanol (2.0 ml/well, minimum 10 minutes), washed, and stained with Giemsa stain (2.0 ml/well, minimum 15 minutes). The number of stained colonies was counted and the titer was calculated by: average # colonies × dilution factor = # CFU/ml.
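By way of illustration only, the titer calculation stated above (average number of colonies multiplied by the dilution factor) can be sketched as follows. The function name and the colony counts are hypothetical; the formula is taken as given and, as written, carries no separate term for the volume of inoculum applied to each well.

```python
from statistics import mean


def titer_cfu_per_ml(colony_counts, dilution_factor):
    """Apply the stated formula: average # colonies x dilution factor = CFU/ml.

    colony_counts: stained colony counts from replicate wells at one dilution
    dilution_factor: fold dilution of the virus stock (e.g. 1e7 or 1e8)
    """
    if not colony_counts:
        raise ValueError("at least one colony count is required")
    return mean(colony_counts) * dilution_factor


# Hypothetical counts from three replicate wells at a 10^7-fold dilution.
print(titer_cfu_per_ml([12, 15, 9], 1e7))
```

Averaging replicate wells before multiplying by the dilution factor follows the order of operations in the stated formula.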

[0176] All publications and patents mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in molecular biology, protein fermentation, biochemistry, or related fields are intended to be within the scope of the following claims.

1 4 1 7617 DNA Artificial Sequence Synthetic 1 gaattaattc ataccagatc accgaaaact gtcctccaaa tgtgtccccc tcacactccc 60 aaattcgcgg gcttctgcct cttagaccac tctaccctat tccccacact caccggagcc 120 aaagccgcgg cccttccgtt tctttgcttt tgaaagaccc cacccgtagg tggcaagcta 180 gcttaagtaa cgccactttg caaggcatgg aaaaatacat aactgagaat agaaaagttc 240 agatcaaggt caggaacaaa gaaacagctg aataccaaac aggatatctg tggtaagcgg 300 ttcctgcccc ggctcagggc caagaacaga tgagacagct gagtgatggg ccaaacagga 360 tatctgtggt aagcagttcc tgccccggct cggggccaag aacagatggt ccccagatgc 420 ggtccagccc tcagcagttt ctagtgaatc atcagatgtt tccagggtgc cccaaggacc 480 tgaaaatgac cctgtacctt atttgaacta accaatcagt tcgcttctcg cttctgttcg 540 cgcgcttccg ctctccgagc tcaataaaag agcccacaac ccctcactcg gcgcgccagt 600 cttccgatag actgcgtcgc ccgggtaccc gtattcccaa taaagcctct tgctgtttgc 660 atccgaatcg tggtctcgct gttccttggg agggtctcct ctgagtgatt gactacccac 720 gacgggggtc tttcatttgg gggctcgtcc gggatttgga gacccctgcc cagggaccac 780 cgacccacca ccgggaggta agctggccag caacttatct gtgtctgtcc gattgtctag 840 tgtctatgtt tgatgttatg cgcctgcgtc tgtactagtt agctaactag ctctgtatct 900 ggcggacccg tggtggaact gacgagttct gaacacccgg ccgcaaccct gggagacgtc 960 ccagggactt tgggggccgt ttttgtggcc cgacctgagg aagggagtcg atgtggaatc 1020 cgaccccgtc aggatatgtg gttctggtag gagacgagaa cctaaaacag ttcccgcctc 1080 cgtctgaatt tttgctttcg gtttggaacc gaagccgcgc gtcttgtctg ctgcagcgct 1140 gcagcatcgt tctgtgttgt ctctgtctga ctgtgtttct gtatttgtct gaaaattagg 1200 gccagactgt taccactccc ttaagtttga ccttaggtca ctggaaagat gtcgagcgga 1260 tcgctcacaa ccagtcggta gatgtcaaga agagacgttg ggttaccttc tgctctgcag 1320 aatggccaac ctttaacgtc ggatggccgc gagacggcac ctttaaccga gacctcatca 1380 cccaggttaa gatcaaggtc ttttcacctg gcccgcatgg acacccagac caggtcccct 1440 acatcgtgac ctgggaagcc ttggcttttg acccccctcc ctgggtcaag ccctttgtac 1500 accctaagcc tccgcctcct cttcctccat ccgccccgtc tctccccctt gaacctcctc 1560 gttcgacccc gcctcgatcc tccctttatc cagccctcac tccttctcta ggcgccggaa 1620 ttccgatctg atcaagagac aggatgaggg agcttgtata 
tccattttcg gatctgatca 1680 gcacgtgttg acaattaatc atcggcatag tatatcggca tagtataata cgacaaggtg 1740 aggaactaaa ccatggccaa gcctttgtct caagaagaat ccaccctcat tgaaagagca 1800 acggctacaa tcaacagcat ccccatctct gaagactaca gcgtcgccag cgcagctctc 1860 tctagcgacg gccgcatctt cactggtgtc aatgtatatc attttactgg gggaccttgt 1920 gcagaactcg tggtgctggg cactgctgct gctgcggcag ctggcaacct gacttgtatc 1980 gtcgcgatcg gaaatgagaa caggggcatc ttgagcccct gcggacggtg tcgacaggtg 2040 cttctcgatc tgcatcctgg gatcaaagcg atagtgaagg acagtgatgg acagccgacg 2100 gcagttggga ttcgtgaatt gctgccctct ggttatgtgt gggagggcta agcacttcgt 2160 ggccgaggag caggactgac acgtgctacg agatttcgat tccaccgccg ccttctatga 2220 aaggttgggc ttcggaatcg ttttccggga cgccgatccg gccattagcc atattattca 2280 ttggttatat agcataaatc aatattggct attggccatt gcatacgttg tatccatatc 2340 ataatatgta catttatatt ggctcatgtc caacattacc gccatgttga cattgattat 2400 tgactagtta ttaatagtaa tcaattacgg ggtcattagt tcatagccca tatatggagt 2460 tccgcgttac ataacttacg gtaaatggcc cgcctggctg accgcccaac gacccccgcc 2520 cattgacgtc aataatgacg tatgttccca tagtaacgcc aatagggact ttccattgac 2580 gtcaatgggt ggagtattta cggtaaactg cccacttggc agtacatcaa gtgtatcata 2640 tgccaagtac gccccctatt gacgtcaatg acggtaaatg gcccgcctgg cattatgccc 2700 agtacatgac cttatgggac tttcctactt ggcagtacat ctacgtatta gtcatcgcta 2760 ttaccatggt gatgcggttt tggcagtaca tcaatgggcg tggatagcgg tttgactcac 2820 ggggatttcc aagtctccac cccattgacg tcaatgggag tttgttttgg caccaaaatc 2880 aacgggactt tccaaaatgt cgtaacaact ccgccccatt gacgcaaatg ggcggtaggc 2940 atgtacggtg ggaggtctat ataagcagag ctcgtttagt gaaccgtcag atcgcctgga 3000 gacgccatcc acgctgtttt gacctccata gaagacaccg ggaccgatcc agcctccgcg 3060 gccccaagct tgttatcaca agtttgtaca aaaaagcagg cttcgaagga gatagaacca 3120 attctctaag gaaatactta accatgggat ggagctgtat catcctcttc ttggtagcaa 3180 cagctacagg tgtccactcc caggtccagc tggtccaatc aggggctgaa gtcaagaaac 3240 ctgggtcatc agtgaaggtc tcctgcaagg cttctggcta cacctttact agctactggc 3300 tgcactgggt caggcaggca cctggacagg gtctggaatg gattggatac 
attaatccta 3360 ggaatgatta tactgagtac aatcagaact tcaaggacaa ggccacaata actgcagacg 3420 aatccaccaa tacagcctac atggagctga gcagcctgag gtctgaggac acggcatttt 3480 atttttgtgc aagaagggat attactacgt tctactgggg ccaaggcacc acggtcaccg 3540 tctcctcagc ctccaccaag ggcccatcgg tcttccccct ggcaccctcc tccaagagca 3600 cctctggggg cacagcggcc ctgggctgcc tggtcaagga ctacttcccc gaaccggtga 3660 cggtgtcgtg gaactcaggc gccctgacca gcggcgtgca caccttcccg gctgtcctac 3720 agtcctcagg actctactcc ctcagcagcg tggtgaccgt gccctccagc agcttgggca 3780 cccagaccta catctgcaac gtgaatcaca agcccagcaa caccaaggtg gacaagagag 3840 ttgagcccaa atcttgtgac aaaactcaca catgcccacc gtgcccagca cctgaactcc 3900 tggggggacc gtcagtcttc ctcttccccc caaaacccaa ggacaccctc atgatctccc 3960 ggacccctga ggtcacatgc gtggtggtgg acgtgagcca cgaagaccct gaggtcaagt 4020 tcaactggta cgtggacggc gtggaggtgc ataatgccaa gacaaagccg cgggaggagc 4080 agtacaacag cacgtaccgt gtggtcagcg tcctcaccgt cctgcaccag gactggctga 4140 atggcaagga gtacaagtgc aaggtctcca acaaagccct cccagccccc atcgagaaaa 4200 ccatctccaa agccaaaggg cagccccgag aaccacaggt gtacaccctg cccccatccc 4260 gggaggagat gaccaagaac caggtcagcc tgacctgcct ggtcaaaggc ttctatccca 4320 gcgacatcgc cgtggagtgg gagagcaatg ggcagccgga gaacaactac aagaccacgc 4380 ctcccgtgct ggactccgac ggctccttct tcctctatag caagctcacc gtggacaaga 4440 gcaggtggca gcaggggaac gtcttctcat gctccgtgat gcacgaggct ctgcacaacc 4500 actacacgca gaagagcctc tccctgtctc ccgggaaatg aaagccgaat tcgcggccgc 4560 actcgagata tctagaccca gctttcttgt acaaagtggt gataacatcg ataaaataaa 4620 agattttatt tagtctccag aaaaaggggg gaatgaaaga ccccacctgt aggtttggca 4680 agctagctta agtaacgcca ttttgcaagg catggaaaaa tacataactg agaatagaga 4740 agttcagatc aaggtcagga acagatggaa cagctgaata tgggccaaac aggatatctg 4800 tggtaagcag ttcctgcccc ggctcagggc caagaacaga tggaacagct gaatatgggc 4860 caaacaggat atctgtggta agcagttcct gccccggctc agggccaaga acagatggtc 4920 cccagatgcg gtccagccct cagcagtttc tagagaacca tcagatgttt ccagggtgcc 4980 ccaaggacct gaaatgaccc tgtgccttat ttgaactaac caatcagttc gcttctcgct 
5040 tctgttcgcg cgcttctgct ccccgagctc aataaaagag cccacaaccc ctcactcggg 5100 gcgccagtcc tccgattgac tgagtcgccc gggtacccgt gtatccaata aaccctcttg 5160 cagttgcatc cgacttgtgg tctcgctgtt ccttgggagg gtctcctctg agtgattgac 5220 tacccgtcag cgggggtctt tcatttttcc attgggggct cgtccgggat cgggagaccc 5280 ctgcccaggg accaccgacc caccaccggg aggtaagctg gctgcctcgc gcgtttcggt 5340 gatgacggtg aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa 5400 gcggatgccg ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg 5460 ggcgcagcca tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg 5520 catcagagca gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg 5580 taaggagaaa ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct 5640 cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca 5700 cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga 5760 accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 5820 acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 5880 cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 5940 acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 6000 atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 6060 agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 6120 acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 6180 gtgctacaga gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg 6240 gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg 6300 gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca 6360 gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga 6420 acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga 6480 tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt 6540 ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt 6600 catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag ggcttaccat 6660 ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca gatttatcag 6720 
caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact ttatccgcct 6780 ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca gttaatagtt 6840 tgcgcaacgt tgttgccatt gctgcaggca tcgtggtgtc acgctcgtcg tttggtatgg 6900 cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc atgttgtgca 6960 aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt 7020 tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca tccgtaagat 7080 gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt atgcggcgac 7140 cgagttgctc ttgcccggcg tcaacacggg ataataccgc gccacatagc agaactttaa 7200 aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt 7260 tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca tcttttactt 7320 tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa 7380 gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat tgaagcattt 7440 atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa 7500 taggggttcc gcgcacattt ccccgaaaag tgccacctga cgtctaagaa accattatta 7560 tcatgacatt aacctataaa aataggcgta tcacgaggcc ctttcgtctt caagaat 7617 2 7626 DNA Artificial Sequence Synthetic 2 gaattaattc ataccagatc accgaaaact gtcctccaaa tgtgtccccc tcacactccc 60 aaattcgcgg gcttctgcct cttagaccac tctaccctat tccccacact caccggagcc 120 aaagccgcgg cccttccgtt tctttgcttt tgaaagaccc cacccgtagg tggcaagcta 180 gcttaagtaa cgccactttg caaggcatgg aaaaatacat aactgagaat agaaaagttc 240 agatcaaggt caggaacaaa gaaacagctg aataccaaac aggatatctg tggtaagcgg 300 ttcctgcccc ggctcagggc caagaacaga tgagacagct gagtgatggg ccaaacagga 360 tatctgtggt aagcagttcc tgccccggct cggggccaag aacagatggt ccccagatgc 420 ggtccagccc tcagcagttt ctagtgaatc atcagatgtt tccagggtgc cccaaggacc 480 tgaaaatgac cctgtacctt atttgaacta accaatcagt tcgcttctcg cttctgttcg 540 cgcgcttccg ctctccgagc tcaataaaag agcccacaac ccctcactcg gcgcgccagt 600 cttccgatag actgcgtcgc ccgggtaccc gtattcccaa taaagcctct tgctgtttgc 660 atccgaatcg tggtctcgct gttccttggg agggtctcct ctgagtgatt gactacccac 720 gacgggggtc tttcatttgg gggctcgtcc gggatttgga gacccctgcc 
cagggaccac 780 cgacccacca ccgggaggta agctggccag caacttatct gtgtctgtcc gattgtctag 840 tgtctatgtt tgatgttatg cgcctgcgtc tgtactagtt agctaactag ctctgtatct 900 ggcggacccg tggtggaact gacgagttct gaacacccgg ccgcaaccct gggagacgtc 960 ccagggactt tgggggccgt ttttgtggcc cgacctgagg aagggagtcg atgtggaatc 1020 cgaccccgtc aggatatgtg gttctggtag gagacgagaa cctaaaacag ttcccgcctc 1080 cgtctgaatt tttgctttcg gtttggaacc gaagccgcgc gtcttgtctg ctgcagcgct 1140 gcagcatcgt tctgtgttgt ctctgtctga ctgtgtttct gtatttgtct gaaaattagg 1200 gccagactgt taccactccc ttaagtttga ccttaggtca ctggaaagat gtcgagcgga 1260 tcgctcacaa ccagtcggta gatgtcaaga agagacgttg ggttaccttc tgctctgcag 1320 aatggccaac ctttaacgtc ggatggccgc gagacggcac ctttaaccga gacctcatca 1380 cccaggttaa gatcaaggtc ttttcacctg gcccgcatgg acacccagac caggtcccct 1440 acatcgtgac ctgggaagcc ttggcttttg acccccctcc ctgggtcaag ccctttgtac 1500 accctaagcc tccgcctcct cttcctccat ccgccccgtc tctccccctt gaacctcctc 1560 gttcgacccc gcctcgatcc tccctttatc cagccctcac tccttctcta ggcgccggaa 1620 ttccgatctg atcaagagac aggatgaggg agcttgtata tccattttcg gatctgatca 1680 gcacgtgttg acaattaatc atcggcatag tatatcggca tagtataata cgacaaggtg 1740 aggaactaaa ccatggccaa gcctttgtct caagaagaat ccaccctcat tgaaagagca 1800 acggctacaa tcaacagcat ccccatctct gaagactaca gcgtcgccag cgcagctctc 1860 tctagcgacg gccgcatctt cactggtgtc aatgtatatc attttactgg gggaccttgt 1920 gcagaactcg tggtgctggg cactgctgct gctgcggcag ctggcaacct gacttgtatc 1980 gtcgcgatcg gaaatgagaa caggggcatc ttgagcccct gcggacggtg tcgacaggtg 2040 cttctcgatc tgcatcctgg gatcaaagcg atagtgaagg acagtgatgg acagccgacg 2100 gcagttggga ttcgtgaatt gctgccctct ggttatgtgt gggagggcta agcacttcgt 2160 ggccgaggag caggactgac acgtgctacg agatttcgat tccaccgccg ccttctatga 2220 aaggttgggc ttcggaatcg ttttccggga cgccgatccg gccattagcc atattattca 2280 ttggttatat agcataaatc aatattggct attggccatt gcatacgttg tatccatatc 2340 ataatatgta catttatatt ggctcatgtc caacattacc gccatgttga cattgattat 2400 tgactagtta ttaatagtaa tcaattacgg ggtcattagt tcatagccca tatatggagt 2460 
tccgcgttac ataacttacg gtaaatggcc cgcctggctg accgcccaac gacccccgcc 2520 cattgacgtc aataatgacg tatgttccca tagtaacgcc aatagggact ttccattgac 2580 gtcaatgggt ggagtattta cggtaaactg cccacttggc agtacatcaa gtgtatcata 2640 tgccaagtac gccccctatt gacgtcaatg acggtaaatg gcccgcctgg cattatgccc 2700 agtacatgac cttatgggac tttcctactt ggcagtacat ctacgtatta gtcatcgcta 2760 ttaccatggt gatgcggttt tggcagtaca tcaatgggcg tggatagcgg tttgactcac 2820 ggggatttcc aagtctccac cccattgacg tcaatgggag tttgttttgg caccaaaatc 2880 aacgggactt tccaaaatgt cgtaacaact ccgccccatt gacgcaaatg ggcggtaggc 2940 atgtacggtg ggaggtctat ataagcagag ctcgtttagt gaaccgtcag atcgcctgga 3000 gacgccatcc acgctgtttt gacctccata gaagacaccg ggaccgatcc agcctccgcg 3060 gccccaagct tgttatcaca agtttgtaca aaaaagcagg cttcgaagga gatagaacca 3120 attctctaag gaaatactta accatgggat ggagctgtat catcctcttc ttggtagcaa 3180 cagctacagg tgtccactcc gaggtccaac tggtggagag cggtggaggt gttgtgcaac 3240 ctggccggtc cctgcgcctg tcctgctccg catctggctt cgatttcacc acatattgga 3300 tgagttgggt gagacaggca cctggaaaag gtcttgagtg gattggagaa attcatccag 3360 atagcagtac gattaactat gcgccgtctc taaaggatag atttacaata tcgcgagaca 3420 acgccaagaa cacattgttc ctgcaaatgg acagcctgag acccgaagac accggggtct 3480 atttttgtgc aagcctttac ttcggcttcc cctggtttgc ttattggggc caagggaccc 3540 cggtcaccgt ctcctcagcc tccaccaagg gcccatcggt cttccccctg gcaccctcct 3600 ccaagagcac ctctgggggc acagcggccc tgggctgcct ggtcaaggac tacttccccg 3660 aaccggtgac ggtgtcgtgg aactcaggcg ccctgaccag cggcgtgcac accttcccgg 3720 ctgtcctaca gtcctcagga ctctactccc tcagcagcgt ggtgaccgtg ccctccagca 3780 gcttgggcac ccagacctac atctgcaacg tgaatcacaa gcccagcaac accaaggtgg 3840 acaagagagt tgagcccaaa tcttgtgaca aaactcacac atgcccaccg tgcccagcac 3900 ctgaactcct ggggggaccg tcagtcttcc tcttcccccc aaaacccaag gacaccctca 3960 tgatctcccg gacccctgag gtcacatgcg tggtggtgga cgtgagccac gaagaccctg 4020 aggtcaagtt caactggtac gtggacggcg tggaggtgca taatgccaag acaaagccgc 4080 gggaggagca gtacaacagc acgtaccgtg tggtcagcgt cctcaccgtc ctgcaccagg 4140 actggctgaa 
tggcaaggag tacaagtgca aggtctccaa caaagccctc ccagccccca 4200 tcgagaaaac catctccaaa gccaaagggc agccccgaga accacaggtg tacaccctgc 4260 ccccatcccg ggaggagatg accaagaacc aggtcagcct gacctgcctg gtcaaaggct 4320 tctatcccag cgacatcgcc gtggagtggg agagcaatgg gcagccggag aacaactaca 4380 agaccacgcc tcccgtgctg gactccgacg gctccttctt cctctatagc aagctcaccg 4440 tggacaagag caggtggcag caggggaacg tcttctcatg ctccgtgatg cacgaggctc 4500 tgcacaacca ctacacgcag aagagcctct ccctgtctcc cgggaaatga aagccgaatt 4560 cgcggccgca ctcgagatat ctagacccag ctttcttgta caaagtggtg ataacatcga 4620 taaaataaaa gattttattt agtctccaga aaaagggggg aatgaaagac cccacctgta 4680 ggtttggcaa gctagcttaa gtaacgccat tttgcaaggc atggaaaaat acataactga 4740 gaatagagaa gttcagatca aggtcaggaa cagatggaac agctgaatat gggccaaaca 4800 ggatatctgt ggtaagcagt tcctgccccg gctcagggcc aagaacagat ggaacagctg 4860 aatatgggcc aaacaggata tctgtggtaa gcagttcctg ccccggctca gggccaagaa 4920 cagatggtcc ccagatgcgg tccagccctc agcagtttct agagaaccat cagatgtttc 4980 cagggtgccc caaggacctg aaatgaccct gtgccttatt tgaactaacc aatcagttcg 5040 cttctcgctt ctgttcgcgc gcttctgctc cccgagctca ataaaagagc ccacaacccc 5100 tcactcgggg cgccagtcct ccgattgact gagtcgcccg ggtacccgtg tatccaataa 5160 accctcttgc agttgcatcc gacttgtggt ctcgctgttc cttgggaggg tctcctctga 5220 gtgattgact acccgtcagc gggggtcttt catttttcca ttgggggctc gtccgggatc 5280 gggagacccc tgcccaggga ccaccgaccc accaccggga ggtaagctgg ctgcctcgcg 5340 cgtttcggtg atgacggtga aaacctctga cacatgcagc tcccggagac ggtcacagct 5400 tgtctgtaag cggatgccgg gagcagacaa gcccgtcagg gcgcgtcagc gggtgttggc 5460 gggtgtcggg gcgcagccat gacccagtca cgtagcgata gcggagtgta tactggctta 5520 actatgcggc atcagagcag attgtactga gagtgcacca tatgcggtgt gaaataccgc 5580 acagatgcgt aaggagaaaa taccgcatca ggcgctcttc cgcttcctcg ctcactgact 5640 cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag gcggtaatac 5700 ggttatccac agaatcaggg gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa 5760 aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc cgcccccctg 5820 acgagcatca caaaaatcga 
cgctcaagtc agaggtggcg aaacccgaca ggactataaa 5880 gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg accctgccgc 5940 ttaccggata cctgtccgcc tttctccctt cgggaagcgt ggcgctttct catagctcac 6000 gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac 6060 cccccgttca gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg 6120 taagacacga cttatcgcca ctggcagcag ccactggtaa caggattagc agagcgaggt 6180 atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa ctacggctac actagaagga 6240 cagtatttgg tatctgcgct ctgctgaagc cagttacctt cggaaaaaga gttggtagct 6300 cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga 6360 ttacgcgcag aaaaaaagga tctcaagaag atcctttgat cttttctacg gggtctgacg 6420 ctcagtggaa cgaaaactca cgttaaggga ttttggtcat gagattatca aaaaggatct 6480 tcacctagat ccttttaaat taaaaatgaa gttttaaatc aatctaaagt atatatgagt 6540 aaacttggtc tgacagttac caatgcttaa tcagtgaggc acctatctca gcgatctgtc 6600 tatttcgttc atccatagtt gcctgactcc ccgtcgtgta gataactacg atacgggagg 6660 gcttaccatc tggccccagt gctgcaatga taccgcgaga cccacgctca ccggctccag 6720 atttatcagc aataaaccag ccagccggaa gggccgagcg cagaagtggt cctgcaactt 6780 tatccgcctc catccagtct attaattgtt gccgggaagc tagagtaagt agttcgccag 6840 ttaatagttt gcgcaacgtt gttgccattg ctgcaggcat cgtggtgtca cgctcgtcgt 6900 ttggtatggc ttcattcagc tccggttccc aacgatcaag gcgagttaca tgatccccca 6960 tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat cgttgtcaga agtaagttgg 7020 ccgcagtgtt atcactcatg gttatggcag cactgcataa ttctcttact gtcatgccat 7080 ccgtaagatg cttttctgtg actggtgagt actcaaccaa gtcattctga gaatagtgta 7140 tgcggcgacc gagttgctct tgcccggcgt caacacggga taataccgcg ccacatagca 7200 gaactttaaa agtgctcatc attggaaaac gttcttcggg gcgaaaactc tcaaggatct 7260 taccgctgtt gagatccagt tcgatgtaac ccactcgtgc acccaactga tcttcagcat 7320 cttttacttt caccagcgtt tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa 7380 agggaataag ggcgacacgg aaatgttgaa tactcatact cttccttttt caatattatt 7440 gaagcattta tcagggttat tgtctcatga gcggatacat atttgaatgt atttagaaaa 7500 ataaacaaat aggggttccg cgcacatttc 
cccgaaaagt gccacctgac gtctaagaaa 7560 ccattattat catgacatta acctataaaa ataggcgtat cacgaggccc tttcgtcttc 7620 aagaat 7626 3 7490 DNA Artificial Sequence Synthetic 3 gaattaattc ataccagatc accgaaaact gtcctccaaa tgtgtccccc tcacactccc 60 aaattcgcgg gcttctgcct cttagaccac tctaccctat tccccacact caccggagcc 120 aaagccgcgg cccttccgtt tctttgcttt tgaaagaccc cacccgtagg tggcaagcta 180 gcttaagtaa cgccactttg caaggcatgg aaaaatacat aactgagaat agaaaagttc 240 agatcaaggt caggaacaaa gaaacagctg aataccaaac aggatatctg tggtaagcgg 300 ttcctgcccc ggctcagggc caagaacaga tgagacagct gagtgatggg ccaaacagga 360 tatctgtggt aagcagttcc tgccccggct cggggccaag aacagatggt ccccagatgc 420 ggtccagccc tcagcagttt ctagtgaatc atcagatgtt tccagggtgc cccaaggacc 480 tgaaaatgac cctgtacctt atttgaacta accaatcagt tcgcttctcg cttctgttcg 540 cgcgcttccg ctctccgagc tcaataaaag agcccacaac ccctcactcg gcgcgccagt 600 cttccgatag actgcgtcgc ccgggtaccc gtattcccaa taaagcctct tgctgtttgc 660 atccgaatcg tggtctcgct gttccttggg agggtctcct ctgagtgatt gactacccac 720 gacgggggtc tttcatttgg gggctcgtcc gggatttgga gacccctgcc cagggaccac 780 cgacccacca ccgggaggta agctggccag caacttatct gtgtctgtcc gattgtctag 840 tgtctatgtt tgatgttatg cgcctgcgtc tgtactagtt agctaactag ctctgtatct 900 ggcggacccg tggtggaact gacgagttct gaacacccgg ccgcaaccct gggagacgtc 960 ccagggactt tgggggccgt ttttgtggcc cgacctgagg aagggagtcg atgtggaatc 1020 cgaccccgtc aggatatgtg gttctggtag gagacgagaa cctaaaacag ttcccgcctc 1080 cgtctgaatt tttgctttcg gtttggaacc gaagccgcgc gtcttgtctg ctgcagcgct 1140 gcagcatcgt tctgtgttgt ctctgtctga ctgtgtttct gtatttgtct gaaaattagg 1200 gccagactgt taccactccc ttaagtttga ccttaggtca ctggaaagat gtcgagcgga 1260 tcgctcacaa ccagtcggta gatgtcaaga agagacgttg ggttaccttc tgctctgcag 1320 aatggccaac ctttaacgtc ggatggccgc gagacggcac ctttaaccga gacctcatca 1380 cccaggttaa gatcaaggtc ttttcacctg gcccgcatgg acacccagac caggtcccct 1440 acatcgtgac ctgggaagcc ttggcttttg acccccctcc ctgggtcaag ccctttgtac 1500 accctaagcc tccgcctcct cttcctccat ccgccccgtc tctccccctt gaacctcctc 1560 
gttcgacccc gcctcgatcc tccctttatc cagccctcac tccttctcta ggcgccggaa 1620 ttccgatctg atcaagagac aggatgagga tcgtttcgca tgattgaaca agatggattg 1680 cacgcaggtt ctccggccgc ttgggtggag aggctattcg gctatgactg ggcacaacag 1740 acaatcggct gctctgatgc cgccgtgttc cggctgtcag cgcaggggcg cccggttctt 1800 tttgtcaaga ccgacctgtc cggtgccctg aatgaactgc aggacgaggc agcgcggcta 1860 tcgtggctgg ccacgacggg cgttccttgc gcagctgtgc tcgacgttgt cactgaagcg 1920 ggaagggact ggctgctatt gggcgaagtg ccggggcagg atctcctgtc atctcacctt 1980 gctcctgccg agaaagtatc catcatggct gatgcaatgc ggcggctgca tacgcttgat 2040 ccggctacct gcccattcga ccaccaagcg aaacatcgca tcgagcgagc acgtactcgg 2100 atggaagccg gtcttgtcga tcaggatgat ctggacgaag agcatcaggg gctcgcgcca 2160 gccgaactgt tcgccaggct caaggcgcgc atgcccgacg gcgaggatct cgtcgtgacc 2220 catggcgatg cctgcttgcc gaatatcatg gtggaaaatg gccgcttttc tggattcatc 2280 gactgtggcc ggctgggtgt ggcggaccgc tatcaggaca tagcgttggc tacccgtgat 2340 attgctgaag agcttggcgg cgaatgggct gaccgcttcc tcgtgcttta cggtatcgcc 2400 gctcccgatt cgcagcgcat cgccttctat cgccttcttg acgagttctt ctgagcggga 2460 ctctggggtt cgaaatgacc gaccaagcga cgcccaacct gccatcacga gatttcgatt 2520 ccaccgccgc cttctatgaa aggttgggct tcggaatcgt tttccgggac gccggctgga 2580 tgatcctcca gcgcggggat ctcatgctgg agttcttcgc ccaccccggg ctcgatcccc 2640 tcgcgagttg gttcagctgc tgcctgaggc tggacgacct cgcggagttc taccggcagt 2700 gcaaatccgt cggcatccag gaaaccagca gcggctatcc gcgcatccat gcccccgaac 2760 tgcaggagtg gggaggcacg atggccgctt tggtcgaggc ggatccggcc attagccata 2820 ttattcattg gttatatagc ataaatcaat attggctatt ggccattgca tacgttgtat 2880 ccatatcata atatgtacat ttatattggc tcatgtccaa cattaccgcc atgttgacat 2940 tgattattga ctagttatta atagtaatca attacggggt cattagttca tagcccatat 3000 atggagttcc gcgttacata acttacggta aatggcccgc ctggctgacc gcccaacgac 3060 ccccgcccat tgacgtcaat aatgacgtat gttcccatag taacgccaat agggactttc 3120 cattgacgtc aatgggtgga gtatttacgg taaactgccc acttggcagt acatcaagtg 3180 tatcatatgc caagtacgcc ccctattgac gtcaatgacg gtaaatggcc cgcctggcat 3240 tatgcccagt 
acatgacctt atgggacttt cctacttggc agtacatcta cgtattagtc 3300 atcgctatta ccatggtgat gcggttttgg cagtacatca atgggcgtgg atagcggttt 3360 gactcacggg gatttccaag tctccacccc attgacgtca atgggagttt gttttggcac 3420 caaaatcaac gggactttcc aaaatgtcgt aacaactccg ccccattgac gcaaatgggc 3480 ggtaggcatg tacggtggga ggtctatata agcagagctc gtttagtgaa ccgtcagatc 3540 gcctggagac gccatccacg ctgttttgac ctccatagaa gacaccggga ccgatccagc 3600 ctccgcggcc ccaagcttgt tatcacaagt ttgtacaaaa aagcaggctt cgaaggagat 3660 agaaccaatt ctctaaggaa atacttaacg tcgactggat ccggtaccga attcggcgcc 3720 gccaccatga tgtcctttgt ctctctgctc ctggtaggca tcctattcca tgccacccag 3780 gccgacatcc agctgaccca gtctccatca tctctgagcg catctgttgg agatagggtc 3840 actatgagct gtaagtccag tcaaagtgtt ttatacagtg caaatcacaa gaactacttg 3900 gcctggtacc agcagaaacc agggaaagca cctaaactgc tgatctactg ggcatccact 3960 agggaatctg gtgtcccttc gcgattctct ggcagcggat ctgggacaga ttttactttc 4020 accatcagct ctcttcaacc agaagacatt gcaacatatt attgtcacca atacctctcc 4080 tcgtggacgt tcggtggagg gaccaaggtg cagatcaaac gaactgtggc tgcaccatct 4140 gtcttcatct tcccgccatc tgatgagcag ttgaaatctg gaactgcctc tgttgtgtgc 4200 ctgctgaata acttctatcc cagagaggcc aaagtacagt ggaaggtgga taacgccctc 4260 caatcgggta actcccagga gagtgtcaca gagcaggaca gcaaggacag cacctacagc 4320 ctcagcagca ccctgacgct gagcaaagca gactacgaga aacacaaagt ctacgcctgc 4380 gaagtcaccc atcagggcct gagctcgccc gtcacaaaga gcttcaacag gggagagtgt 4440 tagatctcga gatatctaga cccagctttc ttgtacaaag tggtgataac atcgataaaa 4500 taaaagattt tatttagtct ccagaaaaag gggggaatga aagaccccac ctgtaggttt 4560 ggcaagctag cttaagtaac gccattttgc aaggcatgga aaaatacata actgagaata 4620 gagaagttca gatcaaggtc aggaacagat ggaacagctg aatatgggcc aaacaggata 4680 tctgtggtaa gcagttcctg ccccggctca gggccaagaa cagatggaac agctgaatat 4740 gggccaaaca ggatatctgt ggtaagcagt tcctgccccg gctcagggcc aagaacagat 4800 ggtccccaga tgcggtccag ccctcagcag tttctagaga accatcagat gtttccaggg 4860 tgccccaagg acctgaaatg accctgtgcc ttatttgaac taaccaatca gttcgcttct 4920 cgcttctgtt cgcgcgcttc 
tgctccccga gctcaataaa agagcccaca acccctcact 4980 cggggcgcca gtcctccgat tgactgagtc gcccgggtac ccgtgtatcc aataaaccct 5040 cttgcagttg catccgactt gtggtctcgc tgttccttgg gagggtctcc tctgagtgat 5100 tgactacccg tcagcggggg tctttcattt gggggctcgt ccgggatcgg gagacccctg 5160 cccagggacc accgacccac caccgggagg taagctggct gcctcgcgcg tttcggtgat 5220 gacggtgaaa acctctgaca catgcagctc ccggagacgg tcacagcttg tctgtaagcg 5280 gatgccggga gcagacaagc ccgtcagggc gcgtcagcgg gtgttggcgg gtgtcggggc 5340 gcagccatga cccagtcacg tagcgatagc ggagtgtata ctggcttaac tatgcggcat 5400 cagagcagat tgtactgaga gtgcaccata tgcggtgtga aataccgcac agatgcgtaa 5460 ggagaaaata ccgcatcagg cgctcttccg cttcctcgct cactgactcg ctgcgctcgg 5520 tcgttcggct gcggcgagcg gtatcagctc actcaaaggc ggtaatacgg ttatccacag 5580 aatcagggga taacgcagga aagaacatgt gagcaaaagg ccagcaaaag gccaggaacc 5640 gtaaaaaggc cgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca 5700 aaaatcgacg ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt 5760 ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc 5820 tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc 5880 tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc 5940 ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact 6000 tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg 6060 ctacagagtt cttgaagtgg tggcctaact acggctacac tagaaggaca gtatttggta 6120 tctgcgctct gctgaagcca gttaccttcg gaaaaagagt tggtagctct tgatccggca 6180 aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa gcagcagatt acgcgcagaa 6240 aaaaaggatc tcaagaagat cctttgatct tttctacggg gtctgacgct cagtggaacg 6300 aaaactcacg ttaagggatt ttggtcatga gattatcaaa aaggatcttc acctagatcc 6360 ttttaaatta aaaatgaagt tttaaatcaa tctaaagtat atatgagtaa acttggtctg 6420 acagttacca atgcttaatc agtgaggcac ctatctcagc gatctgtcta tttcgttcat 6480 ccatagttgc ctgactcccc gtcgtgtaga taactacgat acgggagggc ttaccatctg 6540 gccccagtgc tgcaatgata ccgcgagacc cacgctcacc ggctccagat ttatcagcaa 6600 taaaccagcc agccggaagg gccgagcgca 
gaagtggtcc tgcaacttta tccgcctcca 6660 tccagtctat taattgttgc cgggaagcta gagtaagtag ttcgccagtt aatagtttgc 6720 gcaacgttgt tgccattgct gcaggcatcg tggtgtcacg ctcgtcgttt ggtatggctt 6780 cattcagctc cggttcccaa cgatcaaggc gagttacatg atcccccatg ttgtgcaaaa 6840 aagcggttag ctccttcggt cctccgatcg ttgtcagaag taagttggcc gcagtgttat 6900 cactcatggt tatggcagca ctgcataatt ctcttactgt catgccatcc gtaagatgct 6960 tttctgtgac tggtgagtac tcaaccaagt cattctgaga atagtgtatg cggcgaccga 7020 gttgctcttg cccggcgtca acacgggata ataccgcgcc acatagcaga actttaaaag 7080 tgctcatcat tggaaaacgt tcttcggggc gaaaactctc aaggatctta ccgctgttga 7140 gatccagttc gatgtaaccc actcgtgcac ccaactgatc ttcagcatct tttactttca 7200 ccagcgtttc tgggtgagca aaaacaggaa ggcaaaatgc cgcaaaaaag ggaataaggg 7260 cgacacggaa atgttgaata ctcatactct tcctttttca atattattga agcatttatc 7320 agggttattg tctcatgagc ggatacatat ttgaatgtat ttagaaaaat aaacaaatag 7380 gggttccgcg cacatttccc cgaaaagtgc cacctgacgt ctaagaaacc attattatca 7440 tgacattaac ctataaaaat aggcgtatca cgaggccctt tcgtcttcaa 7490 4 7472 DNA Artificial Sequence Synthetic 4 gaattaattc ataccagatc accgaaaact gtcctccaaa tgtgtccccc tcacactccc 60 aaattcgcgg gcttctgcct cttagaccac tctaccctat tccccacact caccggagcc 120 aaagccgcgg cccttccgtt tctttgcttt tgaaagaccc cacccgtagg tggcaagcta 180 gcttaagtaa cgccactttg caaggcatgg aaaaatacat aactgagaat agaaaagttc 240 agatcaaggt caggaacaaa gaaacagctg aataccaaac aggatatctg tggtaagcgg 300 ttcctgcccc ggctcagggc caagaacaga tgagacagct gagtgatggg ccaaacagga 360 tatctgtggt aagcagttcc tgccccggct cggggccaag aacagatggt ccccagatgc 420 ggtccagccc tcagcagttt ctagtgaatc atcagatgtt tccagggtgc cccaaggacc 480 tgaaaatgac cctgtacctt atttgaacta accaatcagt tcgcttctcg cttctgttcg 540 cgcgcttccg ctctccgagc tcaataaaag agcccacaac ccctcactcg gcgcgccagt 600 cttccgatag actgcgtcgc ccgggtaccc gtattcccaa taaagcctct tgctgtttgc 660 atccgaatcg tggtctcgct gttccttggg agggtctcct ctgagtgatt gactacccac 720 gacgggggtc tttcatttgg gggctcgtcc gggatttgga gacccctgcc cagggaccac 780 cgacccacca ccgggaggta 
agctggccag caacttatct gtgtctgtcc gattgtctag 840 tgtctatgtt tgatgttatg cgcctgcgtc tgtactagtt agctaactag ctctgtatct 900 ggcggacccg tggtggaact gacgagttct gaacacccgg ccgcaaccct gggagacgtc 960 ccagggactt tgggggccgt ttttgtggcc cgacctgagg aagggagtcg atgtggaatc 1020 cgaccccgtc aggatatgtg gttctggtag gagacgagaa cctaaaacag ttcccgcctc 1080 cgtctgaatt tttgctttcg gtttggaacc gaagccgcgc gtcttgtctg ctgcagcgct 1140 gcagcatcgt tctgtgttgt ctctgtctga ctgtgtttct gtatttgtct gaaaattagg 1200 gccagactgt taccactccc ttaagtttga ccttaggtca ctggaaagat gtcgagcgga 1260 tcgctcacaa ccagtcggta gatgtcaaga agagacgttg ggttaccttc tgctctgcag 1320 aatggccaac ctttaacgtc ggatggccgc gagacggcac ctttaaccga gacctcatca 1380 cccaggttaa gatcaaggtc ttttcacctg gcccgcatgg acacccagac caggtcccct 1440 acatcgtgac ctgggaagcc ttggcttttg acccccctcc ctgggtcaag ccctttgtac 1500 accctaagcc tccgcctcct cttcctccat ccgccccgtc tctccccctt gaacctcctc 1560 gttcgacccc gcctcgatcc tccctttatc cagccctcac tccttctcta ggcgccggaa 1620 ttccgatctg atcaagagac aggatgagga tcgtttcgca tgattgaaca agatggattg 1680 cacgcaggtt ctccggccgc ttgggtggag aggctattcg gctatgactg ggcacaacag 1740 acaatcggct gctctgatgc cgccgtgttc cggctgtcag cgcaggggcg cccggttctt 1800 tttgtcaaga ccgacctgtc cggtgccctg aatgaactgc aggacgaggc agcgcggcta 1860 tcgtggctgg ccacgacggg cgttccttgc gcagctgtgc tcgacgttgt cactgaagcg 1920 ggaagggact ggctgctatt gggcgaagtg ccggggcagg atctcctgtc atctcacctt 1980 gctcctgccg agaaagtatc catcatggct gatgcaatgc ggcggctgca tacgcttgat 2040 ccggctacct gcccattcga ccaccaagcg aaacatcgca tcgagcgagc acgtactcgg 2100 atggaagccg gtcttgtcga tcaggatgat ctggacgaag agcatcaggg gctcgcgcca 2160 gccgaactgt tcgccaggct caaggcgcgc atgcccgacg gcgaggatct cgtcgtgacc 2220 catggcgatg cctgcttgcc gaatatcatg gtggaaaatg gccgcttttc tggattcatc 2280 gactgtggcc ggctgggtgt ggcggaccgc tatcaggaca tagcgttggc tacccgtgat 2340 attgctgaag agcttggcgg cgaatgggct gaccgcttcc tcgtgcttta cggtatcgcc 2400 gctcccgatt cgcagcgcat cgccttctat cgccttcttg acgagttctt ctgagcggga 2460 ctctggggtt cgaaatgacc gaccaagcga 
cgcccaacct gccatcacga gatttcgatt 2520 ccaccgccgc cttctatgaa aggttgggct tcggaatcgt tttccgggac gccggctgga 2580 tgatcctcca gcgcggggat ctcatgctgg agttcttcgc ccaccccggg ctcgatcccc 2640 tcgcgagttg gttcagctgc tgcctgaggc tggacgacct cgcggagttc taccggcagt 2700 gcaaatccgt cggcatccag gaaaccagca gcggctatcc gcgcatccat gcccccgaac 2760 tgcaggagtg gggaggcacg atggccgctt tggtcgaggc ggatccggcc attagccata 2820 ttattcattg gttatatagc ataaatcaat attggctatt ggccattgca tacgttgtat 2880 ccatatcata atatgtacat ttatattggc tcatgtccaa cattaccgcc atgttgacat 2940 tgattattga ctagttatta atagtaatca attacggggt cattagttca tagcccatat 3000 atggagttcc gcgttacata acttacggta aatggcccgc ctggctgacc gcccaacgac 3060 ccccgcccat tgacgtcaat aatgacgtat gttcccatag taacgccaat agggactttc 3120 cattgacgtc aatgggtgga gtatttacgg taaactgccc acttggcagt acatcaagtg 3180 tatcatatgc caagtacgcc ccctattgac gtcaatgacg gtaaatggcc cgcctggcat 3240 tatgcccagt acatgacctt atgggacttt cctacttggc agtacatcta cgtattagtc 3300 atcgctatta ccatggtgat gcggttttgg cagtacatca atgggcgtgg atagcggttt 3360 gactcacggg gatttccaag tctccacccc attgacgtca atgggagttt gttttggcac 3420 caaaatcaac gggactttcc aaaatgtcgt aacaactccg ccccattgac gcaaatgggc 3480 ggtaggcatg tacggtggga ggtctatata agcagagctc gtttagtgaa ccgtcagatc 3540 gcctggagac gccatccacg ctgttttgac ctccatagaa gacaccggga ccgatccagc 3600 ctccgcggcc ccaagcttgt tatcacaagt ttgtacaaaa aagcaggctt cgaaggagat 3660 agaaccaatt ctctaaggaa atacttaacg tcgactggat ccggtaccga attcggcgcc 3720 gccaccatga tgtcctttgt ctctctgctc ctggtaggca tcctattcca tgccacccag 3780 gccgacatcc agctgaccca gagcccaagc agcctgagcg ccagcgtggg tgacagagtg 3840 accatcacct gtaaggccag tcaggatgtg ggtacttctg tagcctggta ccagcagaag 3900 ccaggtaagg ctccaaagct gctgatctac tggacatcca cccggcacac tggtgtgcca 3960 agcagattca gcggtagcgg tagcggtacc gacttcacct tcaccatcag cagcctccag 4020 ccagaggaca tcgccaccta ctactgccag caatatagcc tctatcggtc gttcggccaa 4080 gggaccaagg tggaaatcaa acgaactgtg gctgcaccat ctgtcttcat cttcccgcca 4140 tctgatgagc agttgaaatc tggaactgcc tctgttgtgt 
gcctgctgaa taacttctat 4200 cccagagagg ccaaagtaca gtggaaggtg gataacgccc tccaatcggg taactcccag 4260 gagagtgtca cagagcagga cagcaaggac agcacctaca gcctcagcag caccctgacg 4320 ctgagcaaag cagactacga gaaacacaaa gtctacgcct gcgaagtcac ccatcagggc 4380 ctgagctcgc ccgtcacaaa gagcttcaac aggggagagt gttagatctc gagatatcta 4440 gacccagctt tcttgtacaa agtggtgata acatcgataa aataaaagat tttatttagt 4500 ctccagaaaa aggggggaat gaaagacccc acctgtaggt ttggcaagct agcttaagta 4560 acgccatttt gcaaggcatg gaaaaataca taactgagaa tagagaagtt cagatcaagg 4620 tcaggaacag atggaacagc tgaatatggg ccaaacagga tatctgtggt aagcagttcc 4680 tgccccggct cagggccaag aacagatgga acagctgaat atgggccaaa caggatatct 4740 gtggtaagca gttcctgccc cggctcaggg ccaagaacag atggtcccca gatgcggtcc 4800 agccctcagc agtttctaga gaaccatcag atgtttccag ggtgccccaa ggacctgaaa 4860 tgaccctgtg ccttatttga actaaccaat cagttcgctt ctcgcttctg ttcgcgcgct 4920 tctgctcccc gagctcaata aaagagccca caacccctca ctcggggcgc cagtcctccg 4980 attgactgag tcgcccgggt acccgtgtat ccaataaacc ctcttgcagt tgcatccgac 5040 ttgtggtctc gctgttcctt gggagggtct cctctgagtg attgactacc cgtcagcggg 5100 ggtctttcat ttgggggctc gtccgggatc gggagacccc tgcccaggga ccaccgaccc 5160 accaccggga ggtaagctgg ctgcctcgcg cgtttcggtg atgacggtga aaacctctga 5220 cacatgcagc tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa 5280 gcccgtcagg gcgcgtcagc gggtgttggc gggtgtcggg gcgcagccat gacccagtca 5340 cgtagcgata gcggagtgta tactggctta actatgcggc atcagagcag attgtactga 5400 gagtgcacca tatgcggtgt gaaataccgc acagatgcgt aaggagaaaa taccgcatca 5460 ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 5520 cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag 5580 gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 5640 tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc 5700 agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 5760 tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 5820 cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg 
gtgtaggtcg 5880 ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 5940 ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 6000 ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 6060 ggtggcctaa ctacggctac actagaagga cagtatttgg tatctgcgct ctgctgaagc 6120 cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 6180 gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 6240 atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga 6300 ttttggtcat gagattatca aaaaggatct tcacctagat ccttttaaat taaaaatgaa 6360 gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac caatgcttaa 6420 tcagtgaggc acctatctca gcgatctgtc tatttcgttc atccatagtt gcctgactcc 6480 ccgtcgtgta gataactacg atacgggagg gcttaccatc tggccccagt gctgcaatga 6540 taccgcgaga cccacgctca ccggctccag atttatcagc aataaaccag ccagccggaa 6600 gggccgagcg cagaagtggt cctgcaactt tatccgcctc catccagtct attaattgtt 6660 gccgggaagc tagagtaagt agttcgccag ttaatagttt gcgcaacgtt gttgccattg 6720 ctgcaggcat cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc tccggttccc 6780 aacgatcaag gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt agctccttcg 6840 gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg gttatggcag 6900 cactgcataa ttctcttact gtcatgccat ccgtaagatg cttttctgtg actggtgagt 6960 actcaaccaa gtcattctga gaatagtgta tgcggcgacc gagttgctct tgcccggcgt 7020 caacacggga taataccgcg ccacatagca gaactttaaa agtgctcatc attggaaaac 7080 gttcttcggg gcgaaaactc tcaaggatct taccgctgtt gagatccagt tcgatgtaac 7140 ccactcgtgc acccaactga tcttcagcat cttttacttt caccagcgtt tctgggtgag 7200 caaaaacagg aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa 7260 tactcatact cttccttttt caatattatt gaagcattta tcagggttat tgtctcatga 7320 gcggatacat atttgaatgt atttagaaaa ataaacaaat aggggttccg cgcacatttc 7380 cccgaaaagt gccacctgac gtctaagaaa ccattattat catgacatta acctataaaa 7440 ataggcgtat cacgaggccc tttcgtcttc aa 7472 

What is claimed is:
 1. An antibody library comprising at least 10² cells, wherein each cell comprises at least one integrated retroviral vector, wherein said retroviral vector expresses an antibody light chain.
 2. The antibody library of claim 1, wherein said library expresses at least 10² unique antibody light chains.
 3. The antibody library of claim 1, wherein said library expresses at least 10³ unique antibody light chains.
 4. The antibody library of claim 1, wherein said library expresses at least 10⁴ unique antibody light chains.
 5. The antibody library of claim 1, wherein said library expresses at least 10⁵ unique antibody light chains.
 6. The antibody library of claim 1, wherein each of said cells comprises exactly one of said integrated retroviral vector.
 7. An antibody library comprising at least 10² cells, wherein each cell comprises at least one integrated retroviral vector, wherein said retroviral vector expresses an antibody heavy chain.
 8. The antibody library of claim 7, wherein said library expresses at least 10² unique antibody heavy chains.
 9. The antibody library of claim 7, wherein said library expresses at least 10³ unique antibody heavy chains.
 10. The antibody library of claim 7, wherein said library expresses at least 10⁴ unique antibody heavy chains.
 11. The antibody library of claim 7, wherein said library expresses at least 10⁵ unique antibody heavy chains.
 12. The antibody library of claim 7, wherein each of said cells comprises exactly one of said integrated retroviral vector.
 13. An antibody library comprising at least 10² cells, wherein each cell comprises at least one of a first integrated retroviral vector and at least one of a second integrated retroviral vector, wherein said first retroviral vector expresses an antibody light chain and said second retroviral vector expresses an antibody heavy chain, and wherein said antibody light chain and said antibody heavy chain associate to form an antibody.
 14. The antibody library of claim 13, wherein said first and second integrated vectors are separately integrated.
 15. The antibody library of claim 13, wherein said library expresses at least 10² unique antibodies.
 16. The antibody library of claim 13, wherein said library expresses at least 10³ unique antibodies.
 17. The antibody library of claim 13, wherein said library expresses at least 10⁴ unique antibodies.
 18. The antibody library of claim 13, wherein said library expresses at least 10⁵ unique antibodies.
 19. The antibody library of claim 13, wherein said cell comprises exactly one of said first integrated retroviral vector and exactly one of said second integrated retroviral vector.
 20. A retroviral particle library comprising at least 10² retroviral particles, wherein each retroviral particle comprises one antibody light chain gene.
 21. The retroviral particle library of claim 20, wherein said library comprises at least 10² unique antibody light chain genes.
 22. The retroviral particle library of claim 20, wherein said library comprises at least 10³ unique antibody light chain genes.
 23. The retroviral particle library of claim 20, wherein said library comprises at least 10⁴ unique antibody light chain genes.
 24. The retroviral particle library of claim 20, wherein said library comprises at least 10⁵ unique antibody light chain genes.
 25. A retroviral particle library comprising at least 10² retroviral particles, wherein each retroviral particle comprises one antibody heavy chain gene.
 26. The retroviral particle library of claim 25, wherein said library comprises at least 10² unique antibody heavy chain genes.
 27. The retroviral particle library of claim 25, wherein said library comprises at least 10³ unique antibody heavy chain genes.
 28. The retroviral particle library of claim 25, wherein said library comprises at least 10⁴ unique antibody heavy chain genes.
 29. The retroviral particle library of claim 25, wherein said library comprises at least 10⁵ unique antibody heavy chain genes.
 30. A retroviral particle library comprising at least 10² retroviral particles, wherein each retroviral particle comprises at least one antibody gene selected from the group consisting of antibody heavy chain genes and antibody light chain genes.
 31. The retroviral particle library of claim 30, wherein said library comprises at least 10² unique antibody genes.
 32. The retroviral particle library of claim 30, wherein said library comprises at least 10³ unique antibody genes.
 33. The retroviral particle library of claim 30, wherein said library comprises at least 10⁴ unique antibody genes.
 34. The retroviral particle library of claim 30, wherein said library comprises at least 10⁵ unique antibody genes.
 35. The retroviral particle library of claim 30, wherein each retroviral particle comprises one antibody heavy chain gene and one antibody light chain gene.
 36. A plasmid library comprising at least 10² plasmids, wherein each plasmid comprises one antibody heavy chain gene inserted into a retroviral vector backbone.
 37. The plasmid library of claim 36, wherein said library comprises at least 10² unique antibody heavy chain genes.
 38. The plasmid library of claim 36, wherein said library comprises at least 10³ unique antibody heavy chain genes.
 39. The plasmid library of claim 36, wherein said library comprises at least 10⁴ unique antibody heavy chain genes.
 40. The plasmid library of claim 36, wherein said library comprises at least 10⁵ unique antibody heavy chain genes.
 41. A plasmid library comprising at least 10² plasmids, wherein each plasmid comprises one antibody light chain gene inserted into a retroviral vector backbone.
 42. The plasmid library of claim 41, wherein said library comprises at least 10² unique antibody light chain genes.
 43. The plasmid library of claim 41, wherein said library comprises at least 10³ unique antibody light chain genes.
 44. The plasmid library of claim 41, wherein said library comprises at least 10⁴ unique antibody light chain genes.
 45. The plasmid library of claim 41, wherein said library comprises at least 10⁵ unique antibody light chain genes.
 46. A plasmid library comprising at least 10² plasmids, wherein each plasmid comprises at least one antibody gene selected from the group consisting of antibody heavy chain gene and antibody light chain gene.
 47. The plasmid library of claim 46, wherein said library comprises at least 10² unique antibody genes.
 48. The plasmid library of claim 46, wherein said library comprises at least 10³ unique antibody genes.
 49. The plasmid library of claim 46, wherein said library comprises at least 10⁴ unique antibody genes.
 50. The plasmid library of claim 46, wherein said library comprises at least 10⁵ unique antibody light chain genes.
 51. The plasmid library of claim 46, wherein each plasmid comprises one antibody heavy chain gene and one antibody light chain gene.
 52. A method of generating antibody libraries, comprising: a) providing i) a plurality of first integratable retroviral particles, wherein each of said plurality of retroviral particles comprises one antibody light chain gene; ii) a plurality of second integratable retroviral particles, wherein each of said plurality of retroviral particles comprises one antibody heavy chain gene; and iii) a host cell comprising a genome; and b) contacting said host cell with said plurality of first and second integratable retroviral particles under conditions such that at least one of said plurality of first integratable retroviral particles and at least one of said plurality of second integratable retroviral particles integrate into said genome of said host cell to generate an antibody library.
 53. The method of claim 52, wherein said plurality of first integratable retroviral particles further comprises a first selectable marker, and said plurality of second integratable retroviral particles further comprises a second selectable marker.
 54. The method of claim 53, wherein said contacting further comprises selecting for the presence of said first and second selectable markers.
 55. The method of claim 52, wherein said antibody library comprises at least 10² unique antibodies.
 56. The method of claim 52, wherein said antibody library comprises at least 10³ unique antibodies.
 57. The method of claim 52, wherein said antibody library comprises at least 10⁴ unique antibodies.
 58. The method of claim 52, wherein said antibody library comprises at least 10⁵ unique antibodies.
 59. The method of claim 52, wherein exactly one of said plurality of first integratable retroviral particles and exactly one of said plurality of second integratable retroviral particles integrate into said genome of said host cell.
 60. The method of claim 52, further comprising step c) screening said antibody library.
 61. The method of claim 60, wherein said screening comprises detecting the ability of antibodies in said antibody library to bind to a pre-selected antigen.
 62. The method of claim 61, wherein said antibodies are bound to the membrane of said host cell and said detecting comprises fluorescence activated cell sorting.
 63. The method of claim 61, wherein said antibodies are secreted and said detecting comprises a solution-based detection assay.
 64. The method of claim 63, wherein said antibodies are diluted into individual containers prior to said detecting.
 65. The method of claim 63, wherein said solution-based assay is selected from the group consisting of radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, immunoprecipitation reactions, agglutination assays (e.g., hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, and protein A assays.
 66. A method of screening antibody libraries, comprising: a) providing i) an antibody library comprising at least 10² unique antibodies; and ii) a pre-selected antigen; and b) screening said antibody library, wherein said screening comprises detecting the ability of said at least 10² unique antibodies to bind to said pre-selected antigen.
 67. The method of claim 66, wherein said antibody library comprises at least 10³ unique antibodies.
 68. The method of claim 66, wherein said antibody library comprises at least 10⁴ unique antibodies.
 69. The method of claim 66, wherein said antibody library comprises at least 10⁵ unique antibodies.
 70. The method of claim 66, wherein said antibodies are bound to the membrane of a host cell and said detecting comprises fluorescence activated cell sorting.
 71. The method of claim 66, wherein said antibodies are secreted and said detecting comprises a solution-based detection assay.
 72. The method of claim 71, wherein said antibodies are diluted into individual containers prior to said detecting.
 73. The method of claim 71, wherein said solution-based assay is selected from the group consisting of radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, immunoprecipitation reactions, agglutination assays (e.g., hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, and protein A assays.
 74. A method, comprising a) providing i) a plurality of first integratable retroviral particles, wherein each of said plurality of retroviral particles comprises one antibody light chain gene; ii) a plurality of second integratable retroviral particles, wherein each of said plurality of retroviral particles comprises one antibody heavy chain gene; iii) a host cell comprising a genome; and iv) a pre-selected antigen; and b) contacting said host cell with said plurality of first and second integratable retroviral particles under conditions such that at least one of said plurality of first integratable retroviral particles and at least one of said plurality of second integratable retroviral particles integrate into said genome of said host cell to generate an antibody library comprising a plurality of antibodies; and c) screening said antibody library, wherein said screening comprises detecting the ability of said antibodies to bind to said pre-selected antigen.
 75. The method of claim 74, wherein said antibody library comprises at least 10² unique antibodies.
 76. The method of claim 74, wherein said antibody library comprises at least 10³ unique antibodies.
 77. The method of claim 74, wherein said antibody library comprises at least 10⁴ unique antibodies.
 78. The method of claim 74, wherein said antibody library comprises at least 10⁵ unique antibodies.
 79. The method of claim 74, wherein said plurality of first integratable retroviral particles further comprises a first selectable marker, and said plurality of second integratable retroviral particles further comprises a second selectable marker.
 80. The method of claim 79, wherein said contacting further comprises selecting for the presence of said first and second selectable markers.
 81. The method of claim 74, wherein said antibodies are bound to the membrane of said host cell and said detecting comprises fluorescence activated cell sorting.
 82. The method of claim 74, wherein said antibodies are secreted and said detecting comprises a solution-based detection assay.
 83. The method of claim 82, wherein said antibodies are diluted into individual containers prior to said detecting.
 84. The method of claim 82, wherein said solution-based assay is selected from the group consisting of radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, immunoprecipitation reactions, agglutination assays (e.g., hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, and protein A assays.
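
By way of illustration only, and forming no part of the claims, the combinatorial pairing of light and heavy chains recited in claims 13 and 52 may be sketched as follows. The sketch assumes a simplified model in which each cell receives exactly one light chain vector and one heavy chain vector, each drawn uniformly at random from its pool; the function names, pool sizes, and cell counts are hypothetical and are not taken from the specification.

```python
import random


def max_unique_antibodies(n_light, n_heavy):
    """Upper bound on distinct pairings: each light chain with each heavy chain."""
    return n_light * n_heavy


def simulate_library(n_light, n_heavy, n_cells, seed=0):
    """Return the set of distinct (light, heavy) index pairs in a simulated library.

    Assumed model: every cell carries one randomly chosen light chain and
    one randomly chosen heavy chain (compare claims 19 and 59).
    """
    rng = random.Random(seed)
    return {
        (rng.randrange(n_light), rng.randrange(n_heavy))
        for _ in range(n_cells)
    }


if __name__ == "__main__":
    bound = max_unique_antibodies(100, 100)
    observed = simulate_library(100, 100, n_cells=50_000)
    print(f"upper bound: {bound}, observed unique pairs: {len(observed)}")
```

Under this model, two pools of 10² chains each can yield at most 10⁴ unique antibodies, and the number actually represented depends on how many cells are transduced.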